The security of messages encoded via the widely used RSA public key encryption system rests
on the enormous computational effort required to find the prime factors of a large number N using
classical (i.e., conventional) computers. In 1994, however, Peter Shor showed that for sufficiently
large N a quantum computer would be expected to perform the factoring with much less computational
effort. This paper endeavors to explain, in a fashion comprehensible to the non-expert readers
of this journal: (i) the RSA encryption protocol; (ii) the various quantum computer manipulations
constituting the Shor algorithm; (iii) how the Shor algorithm performs the factoring; and (iv) the
precise sense in which a quantum computer employing Shor's algorithm can be said to accomplish
the factoring of very large numbers with less computational effort than a classical computer can. It
is made apparent that factoring N generally requires many successive runs of the algorithm. The
careful analysis herein reveals, however, that the probability of achieving a successful factorization
on a single run is about twice as large as commonly quoted in the literature.

I. INTRODUCTION.

Recently published papers in this journal^'^ have attempted to explain how a quantum computer differs from a 
classical (i.e., a conventional) computer. However, neither of these papers offers a detailed discussion of the factoring
algorithm developed by Peter Shor^ in 1994, although this algorithm, whose implementation could frustrate one of
the most widely used modern methods of encrypting messages, provides the most impressive known illustration of
the increased computing power potentially attainable with quantum computers. The present paper seeks to fill the
gap; namely, it seeks to furnish, self-contained in this journal's pages but with full citations to the relevant literature,
a comprehensible explanation of Shor's algorithm, as well as of the algorithm's relevance to modern cryptography. In
so stating I am aware that although Shor's original paper^ and other available publications providing full discussions
of the Shor algorithm typically are written for quantum computing specialists*, less technical presentations^ more
suitable for the non-specialist readers of this journal can be found. I additionally remark that various Internet
websites^ post links to a wide variety of publications in the quantum computing literature, organized under numerous
suitable headings including Shor's Algorithm. 

I now summarize the contents of this paper. The immediately following Section II first describes the basic elements 
of classical cryptography, wherein keys are employed to encipher and/or decipher messages in order to prevent those 
messages from being read by anyone other than their intended audience. Section II then explains in detail, and also 
explicitly illustrates, the enciphering and deciphering procedures in the so-called RSA system^7, an important modern
scheme for sending secret messages. These explanations and illustrations necessarily involve presentations of the
number theory underlying the RSA system; without some understanding of this underlying number theory the ability
of the RSA system to transmit secure messages seems magical. Section II goes on to explain how critically the security 
of the RSA system depends on the fact that using classical computers to factor large numbers requires huge outlays 
of computer resources and time; the relevance of Shor's factoring algorithm to the security of the RSA system thereby 
becomes evident. The next Section III, after briefly summarizing the relevant properties of quantum computers, fully 
describes and illustrates Shor's algorithm. Section III also explains the precise sense in which a quantum computer 
employing Shor's algorithm can be said to accomplish the factoring of very large numbers with less computational 
effort than a classical computer can. To avoid unduly interrupting the flow of the discussion, many details of the 
underlying number theory, which are important not only for understanding why the RSA system works but also for 
comprehending how Shor's algorithm enables the factorization of large numbers, are relegated to the final Section IV 
which serves as an Appendix. 

Before closing this Introduction I emphasize that except for its discussion of the Shor algorithm, which is designed 
for a quantum computer, this paper is concerned solely with classical computers. In particular I assume that the RSA 
system enciphering and deciphering described in Section II are performed with classical computers, as experience has 
demonstrated is practical with numbers of the size presently being used as RSA keys. The possible use of quantum 
computers to perform such enciphering and deciphering, and any other aspects of quantum cryptography and/or 
computation aside from Shor's algorithm, are beyond the scope of this paper. Also beyond the scope of this paper 
are the difficulties, observed and anticipated, involved in actually constructing functioning quantum computers; such
difficulties are amply discussed in the literature, e.g., in various chapters of Nielsen and Chuang.^ 

II. CRYPTOGRAPHY, KEY DISTRIBUTION AND NUMBER THEORY. 

As Ekert^ observes, "Human desire to communicate secretly is at least as old as writing itself and goes back to
the beginnings of our civilization." The full history of secret communication until about 1965 is recounted by Kahn^; 
developments after about 1965, including those advances in secret communication to which Shor's algorithm pertains, 
are described by Singh^°, who also (but less fully than Kahn) reviews the pre-1965 history. Kahn^ carefully defines the 
terms plaintext (the original uncoded message), cryptogram (a writing in code, e.g., the enciphered message), key (the 
information or system employed to encipher the plaintext message), cryptography (the acts of enciphering the plaintext
into a cryptogram, and/or of deciphering the cryptogram by someone who knows the key), and cryptanalysis (the 
art/science of code breaking, i.e., of ferreting out the key from the cryptogram); this paper adopts Kahn's terminology. 

Until about 1975 cryptographers employed so-called symmetric or private (also known as secret) key systems only.^^ 
In such systems the key used by Alice (conventionally the sender of the cryptogram) to encipher the message is the 
same as the key Bob (conventionally its receiver) employs to decipher the cryptogram. One of the simplest types
of symmetric key (which also appears to have been the earliest type to be employed, dating back to nearly 2000
B.C.^^) is termed substitution. In substitution-key cryptography the cryptogram is constructed from the plaintext by 
replacing each letter (of the alphabet) in the plaintext with some other chosen expression; the replacement expression 
can be another letter of the alphabet, or a symbol of some kind, or an arbitrary combination of letters and symbols; 
the same letter of the alphabet in different portions of the plaintext may be replaced by different expressions; the 
key is the chosen replacement scheme. In the very simplest substitution keys, which for the purpose of this paper 
may be termed unique substitution, Alice and Bob agree that the replacements will be unique and one to one, i.e., 
(i) that any given letter in the plaintext always is replaced by the same expression, and (ii) that different letters of 
the alphabet are replaced by different expressions. Though cryptograms constructed via unique substitution keys
can seem impregnable, especially when the key involves unusual or unfamiliar symbols, they actually are readily 
deciphered taking advantage of the peculiarities of the language (assumedly known or correctly guessed) in which the 
plaintext had been written, as 15th century Arab cryptologists already knew.^'' The writings of Edgar Allan Poe^** and 
Arthur Conan Doyle"'^^ provide celebrated fictional cryptanalyses of unique substitution cryptograms (in these cases
with English plaintexts) wherein letters of the alphabet had been replaced by symbols. In the most transparent unique 
substitution cryptograms a single alphabet letter is substituted for each plaintext letter, consistent with a preselected 
so-called cipher alphabet; cryptograms constructed in this fashion, but employing a different cipher alphabet each day, 
are regularly published in many daily newspapers'^^ as puzzles to be deciphered by the newspaper's readers using, 
e.g., the fact that in English the letter e is by far the most frequent. 
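
The idea of a unique substitution key, and of the frequency count on which its cryptanalysis rests, can be sketched in a few lines of Python. This is only an illustration: the cipher alphabet, the sample message, and the function names below are arbitrary hypothetical choices, not taken from any of the cited sources.

```python
from collections import Counter
import string

PLAIN = string.ascii_uppercase
# Hypothetical cipher alphabet: any permutation of the 26 letters can serve as the key.
CIPHER = "QWERTYUIOPASDFGHJKLZXCVBNM"

ENCIPHER = str.maketrans(PLAIN, CIPHER)
DECIPHER = str.maketrans(CIPHER, PLAIN)

def encipher(plaintext):
    """Replace each letter by its fixed partner in the cipher alphabet."""
    return plaintext.upper().translate(ENCIPHER)

def decipher(cryptogram):
    """Undo the substitution, given knowledge of the key."""
    return cryptogram.translate(DECIPHER)

def letter_frequencies(text):
    """Return (letter, count) pairs, most frequent first."""
    return Counter(ch for ch in text if ch.isalpha()).most_common()

message = "MEET ME HERE WHEN THE EVENING BELL RINGS"
cryptogram = encipher(message)
print(cryptogram)
# The most frequent cryptogram letter is the natural first guess for plaintext E.
print(letter_frequencies(cryptogram)[:3])
```

Because each plaintext letter always maps to the same cryptogram letter, the frequency profile of the language survives the encipherment, which is exactly what the newspaper puzzle solver exploits.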

But substitution cryptograms need not be constructed with unique keys; moreover nonunique substitution cryptograms
can be and have been made very difficult to cryptanalyse. A famous illustration of this last assertion is
provided by the Enigma machine employed by the German army during World War II, which constructed cryptograms
wherein each letter was replaced by a single letter as in newspaper cryptograms, but wherein the cipher
alphabet employed to encipher any given plaintext varied not merely from day to day, but also from one plaintext
letter to the next, in accordance with a predetermined randomly selected complicated key.^* Whether or not very
difficult to cryptanalyse, however, all substitution and other symmetric key cryptographic systems have a deficiency
known as the key distribution problem, as numerous authors have observed^^: Before Alice and Bob can begin exchanging
hopefully non-cryptanalysable cryptograms, they must exchange (in a non-encrypted form) the information
necessary to establish their key; they cannot be confident this information exchange has not been intercepted unless
the exchange takes place in the same room, and perhaps not even then.^" Symmetric key cryptographic systems also
have the related deficiency that if Bob wants to receive secret messages from more than one Alice, then he either must 
set up different keys with each Alice or else risk the possibility that one Alice will intercept and readily decipher a 
message sent Bob by another Alice. 

These deficiencies are avoided in asymmetric (commonly termed public) key systems, wherein the key Alice employs 
to encipher her message is not the same as the key Bob employs to decipher the cryptogram he receives. The differences 
between symmetric and asymmetric key systems can be visualized in terms of a safe: With a symmetric system the 
key Alice employs to open the safe and lock her message inside it is the same as the key Bob employs to open the safe 
and remove the message. With an asymmetric system Alice's key enables her to open the safe just enough to insert 
her message, but no more; only Bob's key can open the safe's door sufficiently widely to permit message removal. 
Indeed in the RSA system'', one of the most commonly employed public key systems, Alice's enciphering key is made 
public, i.e., is not at all secret but rather is available to any Alice who wishes to send Bob an encrypted message; 
Bob's deciphering key remains his secret. The RSA system is detailed immediately below.

A. The RSA Public Key System. 

Bob creates his RSA public key system in the following fashion'': He first selects two different large prime numbers
p and q, and then computes their product N = pq. Next he also computes the product $\phi = (p-1)(q-1) = N + 1 - (p+q)$,
and then selects a positive integer e which is coprime to $\phi$ (meaning e and $\phi$ have no common factors other
than 1). Finally he computes a positive integer d such that the product de has the remainder unity when divided
by $\phi$. Bob now has all the required components of his public key system, except that because the procedure by which
messages are encrypted in his system involves arithmetical manipulations of integers (as I am about to explain), the
system also must include some specified convenient-to-use symmetric key whereby any Alice can convert her plaintext
into a cryptogram consisting of a sequence C of positive integers c, which sequence she then will further encrypt (via
Bob's proclaimed procedure) into the sequence S of positive integers s actually sent Bob.

The non-secret components of Bob's public key system, which Bob now is ready to broadcast for the benefit of one 
and all, are'': (i) the positive integers N and e, which herein will be termed the key number and encryption exponent
respectively; (ii) the details of the symmetric key Alice will be using to construct her C, and which Bob also will
use to reconstruct Alice's original plaintext message once he has deciphered S and thereby recovered C (the only
restriction on C is that every c must be less than N); and (iii) the surprisingly simple procedure for constructing the
elements s of S from the elements c of C, namely s is the integer remainder when $c^e$ is divided by N. Note that the
symmetric key which Alice and Bob share now is completely public; there is no attempt whatsoever to keep it secret.
Similarly Alice transmits her finally enciphered cryptogram S to Bob via perfectly open communication channels,
e.g., by email. The secret procedure by which Bob extracts the original C from S parallels, but is not the direct
inverse of, the public procedure which constructed S from C, namely for each s Bob computes the integer u which is
the remainder when $s^d$ is divided by N. Bob is confident that because he has kept the decryption exponent d secret,
he and only he possesses the secret key that enables deciphering of S. 

At this juncture it is instructive to illustrate the encryption and decryption of messages in this RSA public key 
system of Bob's, in particular the dread message to Bob from Alice that "The FBI came". To do so, it first is necessary
to specify the aforementioned symmetric key. A possible easily usable symmetric key, which Singh suggests^^, requires 
Alice to replace each letter of the alphabet by its ASCII equivalent. ASCII^"^, the acronym for the American Standard 
Code for Information Interchange, is the protocol that converts computer keyboard strokes into the seven-bit electrical 
impulses that transmit our email; each such impulse is the binary representation of a positive integer (less than 
$2^7 = 128$, of course). In ASCII^ the 26 upper case letters A through Z are represented by the integers (in base 10)
65 through 90; the corresponding lower case letters are represented by the integers 97 through 122; a space between
words is represented by the integer 32; the ASCII integer representations of other communication symbols, e.g., the
period or the comma, are irrelevant to my present purpose. Similarly, for my present purpose it is sufficient, as well
as simpler than using ASCII, to ignore the distinction between the upper and lower cases, and to represent the letters 
A through Z by the integers 2 through 27 respectively, saving the integer 1 for the space between words. 
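
The simplified letter-to-integer convention just described (space as 1, A through Z as 2 through 27, case ignored) is easy to state as code. The following is a minimal sketch; the function names `encode` and `decode` are illustrative labels only.

```python
def encode(plaintext):
    """Map a space to 1 and the letters A..Z to 2..27, ignoring case."""
    codes = []
    for ch in plaintext.upper():
        if ch == " ":
            codes.append(1)
        elif "A" <= ch <= "Z":
            codes.append(ord(ch) - ord("A") + 2)
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return codes

def decode(codes):
    """Invert encode(): 1 becomes a space, 2..27 become A..Z."""
    return "".join(" " if c == 1 else chr(c - 2 + ord("A")) for c in codes)

C = encode("The FBI came")
print(C)
print(decode(C))
```

Since this convention is entirely public, it adds no secrecy of its own; it merely turns text into integers on which the RSA arithmetic can act.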

I therefore take Alice's C, corresponding to her above-quoted dread message, to be: 21, 9, 6, 1, 7, 3, 10, 1, 4, 2, 
14, 6. These integers are written in base 10 of course, and unless otherwise noted I shall continue to write integers in 
the familiar base 10 throughout the remainder of this paper; the fact that Alice (with Bob's blessing) may choose to 
write her integers in another base, or may transmit her S to Bob via email (wherein the digits 0 through 9 she uses to
write her base 10 integers will be converted into electrical impulses which are the ASCII seven-bit binary equivalents
of the base 10 integers 48 through 57 respectively^^), in no way affects the validity of any conclusions drawn below.
This point understood, it next is necessary to choose the pair p and q of primes whose product yields the N our
hypothetical Bob had broadcast for Alice's use. I will choose the pair 5 and 11, which makes N = 55, a conveniently
small number for my present illustrative purpose but still large enough to satisfy the requirement that N exceeds
every c in C. The quantity $\phi = 4 \times 10 = 40$; so I now can and do choose e = 23, consistent with the requirement that
e be coprime to $\phi$. To complete Bob's public key we need a positive integer d such that when de is divided by $\phi$ the
remainder is unity. The integer d = 7 fits the bill (de = 161 = 4 × 40 + 1), as the reader instantly can verify; the convenient method which
Bob can use to find d in actual RSA practice is presented under Subheading IV.D.4.
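
The toy key just chosen (p = 5, q = 11, e = 23) can be checked mechanically. The sketch below uses Python's built-in modular inverse as a stand-in for whatever procedure Bob would use in practice; that substitution is an assumption made here purely for brevity, and the three-argument `pow` with exponent -1 requires Python 3.8 or later.

```python
from math import gcd

p, q = 5, 11                 # Bob's two secret primes (toy values from the text)
N = p * q                    # public key number
phi = (p - 1) * (q - 1)      # equals N + 1 - (p + q)
e = 23                       # public encryption exponent

# e must share no factor with phi, otherwise no decryption exponent exists.
assert gcd(e, phi) == 1

# d is the inverse of e modulo phi, i.e. d*e leaves remainder 1 on division by phi.
d = pow(e, -1, phi)
print(N, phi, d, (d * e) % phi)
```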

I now finally am in position to illustrate how Alice constructs her cryptogram S from C. In order to do so efficiently, 
however, it is desirable to introduce the modular arithmetic notation employed in number theory. 

B. Modular Arithmetic Formulation of the RSA Public Key System.

If the positive integer b is the remainder when the positive integer a is divided by the positive integer m, then $a - b$
is exactly divisible by m. In number theory, if a difference of two integers a and b (each of which now may be positive
or negative) is exactly divisible by the positive integer m, then we say "a and b are congruent modulo m", and write^^

$a \equiv b \pmod{m}$. (1)

Thus the above-specified procedure for obtaining the elements s of S from the elements c of C can be written as

$c^e \equiv s \pmod{N}$. (2)

Similarly Bob's secret key procedure for deciphering S can be written as c = u, where

$s^d \equiv u \pmod{N}$. (3)

Finally, the formula yielding d from e is

$de \equiv 1 \pmod{\phi}$. (4)

Since Eq. (1) means there is no remainder when $a - b$ is divided by m, Eq. (1) can be restated as

$a - b \equiv 0 \pmod{m}$. (5)

But if $a - b$ is exactly divisible by m, then so is $a - b - m$, i.e., if Eq. (1) holds then it also is true that

$a \equiv b + m \pmod{m}$. (6)

The pair of Eqs. (1) and (6) imply that Eq. (2), though perfectly correct, does not uniquely determine s without
the additional condition that s is a positive integer < N, which then guarantees that s indeed is the integer remainder
when $c^e$ is divided by N. Similarly Eq. (3) does not uniquely specify u without the additional condition that u is a
positive integer < N.

The use of congruences eases Alice's task of constructing her cryptogram S from C via Eq. (2), as Subsection IV. A 
illustrates; in particular, for the key number N = 55, encryption exponent e = 23, and the C given in the next to last
paragraph of Subsection II. A, Alice's S turns out to be: 21, 14, 51, 1, 13, 27, 10, 1, 9, 8, 49, 51. Using the decryption 
exponent d = 7 in Eq. (3), Bob then readily decrypts this S into precisely Alice's original C, as Subsection IV. A 
also illustrates. The proof that the RSA system really does enable Bob to correctly decipher every S Alice transmits, 
namely (remembering s and u are required to be <N) the proof of the magical fact that Eqs. (2)-(4) imply u = c 
when N = pq and $\phi = (p - 1)(q - 1)$, is presented in Subsection IV.C.
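
The encryption and decryption of the illustrative message can be reproduced directly from Eqs. (2) and (3). The sketch below relies on Python's built-in modular exponentiation rather than on the hand manipulations of congruences that the paper carries out elsewhere; the variable names are illustrative.

```python
N, e, d = 55, 23, 7
C = [21, 9, 6, 1, 7, 3, 10, 1, 4, 2, 14, 6]   # "THE FBI CAME"

# Eq. (2): s is the remainder when c**e is divided by N.
S = [pow(c, e, N) for c in C]

# Eq. (3): u is the remainder when s**d is divided by N.
U = [pow(s, d, N) for s in S]

print(S)
print(U == C)   # Bob recovers exactly what Alice enciphered
```

The three-argument form `pow(c, e, N)` reduces modulo N at every step, so the intermediate numbers never grow beyond N squared even though $c^e$ itself would be enormous.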



C. Cryptanalysis of RSA System Messages. 



The just displayed illustrative S was obtained from our illustrative C by successively inserting each individual c 
into Eq. (2), i.e. (recalling how Alice constructed her illustrative C), by enciphering Alice's original plaintext one 
letter at a time. But insertion of the same c into Eq. (2) always yields the same s; for example since both the third 
and last numbers in our illustrative C (recall Subsection II. A) are 6, both the third and last numbers in our illustrative 
S turn out to be 51. In other words our illustrative S is identical to the S into which Alice's original plaintext would
have been enciphered using an appropriate unique substitution key of the sort described at the outset of Section II. Of 
course with actual RSA key numbers N the actually encountered s in S typically will be very large numbers, not the 
two digit numbers <55 of our illustrative S. Nevertheless it now is apparent that, despite the RSA system's number 
theoretic sophistication, if Alice continues to routinely encipher [via Eq. (2)] her plaintext into individual elements 
s one letter at a time, then the various S she transmits to Bob will be readily cryptanalysable, without any need to 
guess Bob's decryption exponent d or to employ Eq. (3) at all. More particularly, because Alice does not attempt 
to keep secret the messages S she sends to Bob, the relative frequencies and other characteristics of the various s 
in her messages, here assumedly written in English, are readily ascertainable; consequently the standard deciphering 
techniques^'^"^^'^^ applicable to symmetric unique substitution keys, referred to in the opening paragraphs of Section 
II, will be quite employable (e.g., the observation that a relatively infrequent $s_1$ almost invariably is followed by
another $s_2$ suggests $s_1$ is q and $s_2$ is u).
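
That letter-by-letter RSA encipherment amounts to a fixed substitution key can be seen by tabulating Eq. (2) once for every possible letter code. The sketch below does so for the toy key N = 55, e = 23; the names `table` and `positions` are illustrative.

```python
N, e = 55, 23

# Encipher every possible code c = 1..27 (space and A..Z) once.
table = {c: pow(c, e, N) for c in range(1, 28)}

# Distinct codes give distinct outputs, so the table is a unique substitution key.
is_one_to_one = len(set(table.values())) == len(table)
print(is_one_to_one)

C = [21, 9, 6, 1, 7, 3, 10, 1, 4, 2, 14, 6]
S = [table[c] for c in C]

# Repeated plaintext codes reappear as repeated cryptogram entries (here the letter E).
positions = [i for i, s in enumerate(S) if s == table[6]]
print(positions)
```

Anyone can build this table from the public N and e alone, so no knowledge of d is needed to exploit the repetitions.
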

There is no requirement that Alice obtain S by successively inserting each individual c into Eq. (2), however. For
instance Alice simply and efficiently can obtain a very much more difficult to cryptanalyse S' by first converting her
C into a new C' composed of integers c' constructed from large blocks of the original c entries; the only limit on
the sizes of these blocks is that each c' must be less than the key number N. Obviously there can be essentially
no available useful information, for the purposes of cryptanalysis, about the relative frequencies with which blocks
of say forty letters occur in English, especially when during the enciphering those letters can be interrupted by
punctuation marks, as well as by wholly superfluous digit combinations. Thus, returning now to our illustrative C,
whenever Bob's key number N exceeds $10^{40}$ Alice could make her illustrative S very much harder to decipher
by inserting into Eq. (2) not the twelve individual c comprising the illustrative C, but rather the single 40-digit
integer c' = 2143790906449201290703821001045802143806. This c' is composed of twenty pairs of digits wherein,
reading from left to right, the pairs whose magnitudes are less than 28 comprise the original twelve c in their correct
order in C, but wherein each pair exceeding 28 is superfluous and randomly chosen. Alice's insertion of this c' into
Eq. (2) would yield a single s' comprising Alice's entire new sequence S' to be transmitted to Bob, which s' would
have no discernible features related to the presence of those superfluities; yet Bob, after recapturing c' from this s'
using Eq. (3), would instantly be able to recognize and discard the superfluous pairs, i.e., would have no difficulty
whatsoever reconstructing Alice's original "THE FBI CAME" from his recaptured c'. Alice could similarly break up,
into successively transmitted blocks of 40 digits, messages longer than our illustrative "THE FBI CAME".
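
The 40-digit block scheme can be sketched as follows. The helper names `pack` and `unpack` are hypothetical, and the key used for the round trip is an assumption introduced only for illustration: its two primes are the well-known Mersenne primes $2^{127}-1$ and $2^{89}-1$, chosen because their product exceeds $10^{40}$, and the exponent 65537 is simply a convenient value coprime to $\phi$. Such a key would of course be far too small, and too special, for real use.

```python
import random

def pack(codes, total_pairs=20, seed=None):
    """Interleave two-digit codes (each < 28) with random filler pairs (29..99)."""
    rng = random.Random(seed)
    filler_slots = set(rng.sample(range(total_pairs), total_pairs - len(codes)))
    real = iter(codes)
    pairs = [rng.randint(29, 99) if i in filler_slots else next(real)
             for i in range(total_pairs)]
    return int("".join(f"{x:02d}" for x in pairs))

def unpack(block, total_pairs=20):
    """Split a block into digit pairs and keep only the genuine codes (< 28)."""
    digits = str(block).zfill(2 * total_pairs)   # restore a dropped leading zero
    pairs = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]
    return [x for x in pairs if x < 28]

C = [21, 9, 6, 1, 7, 3, 10, 1, 4, 2, 14, 6]
c_prime = 2143790906449201290703821001045802143806   # the block quoted in the text
print(unpack(c_prime) == C)
print(unpack(pack(C, seed=1)) == C)

# Illustrative (assumed) key with N > 10**40.
p, q = 2**127 - 1, 2**89 - 1
N, phi = p * q, (p - 1) * (q - 1)
e = 65537
d = pow(e, -1, phi)                 # requires Python 3.8 or later
s_prime = pow(c_prime, e, N)        # Eq. (2) applied to the whole block
print(unpack(pow(s_prime, d, N)))   # Eq. (3), then discard the filler pairs
```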

This just described quite simple 40-digit block scheme is by no means the only conceivable means of replacing a
letter-by-letter S by an S' whose elements s' bear no useful relationships (for the purposes of decipherment via the techniques
discussed in the opening paragraphs of Section II) to the characteristics of the language in which the original plaintext 
message was written; moreover, as I will explain in a moment, modern RSA key numbers N permit blocks considerably 
larger than merely 40 decimal digits. In short, available perfectly feasible encryption systems make the likelihood 
that Alice's RSA-system messages could be deciphered via the aforementioned techniques, even after receipt by the
would-be decipherer of many openly transmitted messages of hers, virtually zero. On the other hand those techniques, 
depending primarily on language properties, are not the only conceivable means whereby a third party might seek to 
decipher Alice's RSA messages to Bob. Descriptions of such deciphering schemes, which are varied, are beyond the 
scope of this paper. It is sufficient to state that no known means of deciphering RSA messages is computationally more 
practical than decipherment via factorization of the key number N. In particular a very thorough highly technical 
examination^^ of the security of the RSA system found (in 1996): "While it is widely believed that breaking the RSA 
encryption scheme is as difficult as factoring the key number N, no such equivalence has been proven."; I am not
aware of any later publications which contradict this finding. The relevance of being able to factor N is that once
the primes p and q factoring N are known, the value of $\phi$ immediately is yielded by the formula $\phi = (p - 1)(q - 1)$
[recall Subsection II.A], with which formula, along with Eqs. (3) and (4), any codebreaker seeking to break Bob's
RSA encryption system presumably will be acquainted; from $\phi$ and Bob's originally publicly broadcast encryption
exponent e, one easily can determine Bob's originally secret decryption exponent d (as explained under Subheading 
IV.D.4), thereby enabling this codebreaker to decipher Alice's cryptogram (which she did not specially try to keep 
secret) via Eq. (3) no less readily than Bob himself can. 
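
For the toy key of Subsection II.A the codebreaker's route just described can be followed end to end. The sketch below factors N = 55 by trial division, which is feasible only because N is tiny, and then recovers d and the message; the function name `factor_semiprime` is illustrative, and Python's built-in modular inverse (Python 3.8 or later) stands in for the method of Subheading IV.D.4.

```python
def factor_semiprime(N):
    """Return the two factors of N by trial division (practical only for tiny N)."""
    f = 2
    while f * f <= N:
        if N % f == 0:
            return f, N // f
        f += 1
    raise ValueError("N has no nontrivial factor")

N, e = 55, 23                                       # Bob's public key
S = [21, 14, 51, 1, 13, 27, 10, 1, 9, 8, 49, 51]    # Alice's openly sent cryptogram

p, q = factor_semiprime(N)
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)                # Eq. (4) solved for d
C = [pow(s, d, N) for s in S]      # Eq. (3)
print(p, q, d)
print(C)
```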

D. Modern RSA Systems. Factoring N = pq With Classical Computers. 

It follows that (to the best present understanding) Bob can be confident in the security of his RSA system, provided 
there is no practical likelihood that a would-be codebreaker will be able to deduce the prime factors p and q of N
from the publicly known quantities N and e. In essence, therefore, the basis for Bob's confidence is the difficulty of 
factoring, with classical computers, numbers N of the astonishingly large magnitudes typically employed in modern 
RSA keys. According to recent Internet publications by RSA Security, Inc., the company founded by the inventors of 
the RSA system, key numbers N of 1024 binary digits now are both the recommended and popularly employed sizes 
for corporate use^^; 1024 binary digits corresponds to 309 decimal digits. 

The most obvious way to factor a large integer N that is not a prime is to perform the sequence of divisions of
N by the integers 2, 3, ... $\le \sqrt{N}$ until a factor of N is found. If N has only two prime factors, each of the order of
$\sqrt{N}$, then approximately $\sqrt{N}$ divisions will be required to find the factors of N. Thus, as Ekert^'' points out, this
straightforward procedure cannot possibly be relied on to yield the prime factors of really large numbers N, numbers
of 100 decimal digits say (which, though very large by any ordinary standards, are very much smaller than the
present RSA-recommended key numbers of 309 decimal digits). For a 100 decimal digit number N, i.e., for N of the
order of $10^{100}$, approximately $10^{50}$ divisions may be required to ensure factoring by this straightforward procedure.
Even if the average time for a single division is as small as $10^{-12}$ seconds, a very small time indeed even for today's
fastest computers^^, the total time required to factor an $N \approx 10^{100}$ in this fashion well may be $\approx 10^{38}$ seconds,
a duration very much longer than $(3.8)(10)^{17}$ seconds $\approx$ 12 billion years, the present best estimate of the age of the
universe.
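
The scale of this brute force estimate can be checked with a short sketch. The function name, the small test number (a product of the primes 1009 and 1013), and the figure of $3.156 \times 10^{7}$ seconds per year are choices made here for illustration; the per-division time of $10^{-12}$ seconds is taken as an assumption consistent with the totals quoted above.

```python
import math

def trial_division_count(N):
    """Return (smallest factor, number of divisions tried) for a composite N."""
    tried = 0
    for f in range(2, math.isqrt(N) + 1):
        tried += 1
        if N % f == 0:
            return f, tried
    return None, tried

# For a product of two nearby primes, roughly sqrt(N) divisions are needed.
factor, tried = trial_division_count(1009 * 1013)
print(factor, tried)

# Scaling the same count up to a 100-digit N.
divisions = 10.0 ** 50
seconds = divisions * 1e-12          # assumed 1e-12 s per division
years = seconds / 3.156e7            # seconds in one year
print(f"{seconds:.1e} s, about {years:.1e} years")
print(f"ratio to 12 billion years: {years / 1.2e10:.1e}")
```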

Nevertheless in 1994 the 129 decimal digits public key number N known as RSA-129 was factored after only eight 
months of number crunching, thereby winning a symbolic $100 prize Martin Gardner had offered in 1977, shortly 
after RSA-129 was first made public. ^° This accomplishment was made possible by the ingenuity of mathematicians, 
who have been able to devise factoring procedures far more powerful than the brute force procedure described in 
the preceding paragraph; in particular RSA-129 was factored using the so-called quadratic sieve.^^ In 1999 RSA-155
(corresponding to a number of 512 binary digits) was factored after no more than seven months of computing
time^^, using the even more powerful so-called number field sieve.'^^''^^ This factorization of RSA-155, in response 
to the Factoring Challenge^"^ started in 1991 by RSA Security, is^^ the primary reason that RSA Security increased 
its recommended key number N size from the previous RSA-155 to the present RSA-309; indeed RSA Security now 
recommends a key number size of 2048 binary digits (i.e., an RSA-617) "for extremely valuable keys".^^ The wisdom of
this recommendation is manifested by two very recent successful factorizations in response to the Factoring Challenge. 
Factorization of RSA-160 (corresponding to a number of 530 binary digits) was announced'^'"' on April 1, 2003; the 
announcement stated that RSA-160 was factored in less time than RSA-155, and made use of fewer computers in 
parallel. The announcement^^ that RSA-174 (corresponding to a number of 576 binary digits) had been factored came
on December 3, 2003, only eight months later; as of this writing the time and computer facilities needed to factor 
RSA-174 have not been released. 

That RSA-155 was factored with the expenditure of about the same amount of computing time as RSA-129 reflects 
not only the improved power of the number field sieve over the quadratic sieve, but also the fact that classical 
computers had greatly improved in speed during the mere five year interval from 1994 to 1999. This improvement 
is expected to continue, as comparison of the factorization times of RSA-155 and RSA-160 exemplifies. Thus it is 
estimated that by 2009 a computer costing no more than $10 million will be able to factor RSA-309 in less than 
one month''^; correspondingly it is anticipated that by 2010 the standard (not merely "extremely valuable") RSA key 
number size will be 2048 binary digits'^^, and by 2030 will be 3072 binary digits^^ (corresponding to RSA-925). Inherent 
in these anticipations is the well founded belief, thoroughly supported by experience, that if classical computing is 
all that's available then RSA public key systems can be kept secure via increases of key number size no matter how 
much classical computers improve, because the magnitude of the computing effort needed to factor a large number N 
increases so very rapidly with increasing N. 

To be precise, analysis of the number field sieve (presently the most efficient general-purpose factoring technique^^
for numbers N of modern key number size) leads to the conclusion that the number $\nu$ of bit operations required to
factor a large key number N with a classical computer is expected to increase with N no less rapidly than^*'^^

$\nu(N) = \exp[(1.90)(\ln N)^{1/3}(\ln\ln N)^{2/3}] \approx \exp[(1.32)L^{1/3}(\log_2 L)^{2/3}]$ (7)

where: exp(x) denotes the exponential function $e^x$; ln denotes $\log_e$; $L = \log_2 N$, here and throughout this paper;
and a bit operation^^ denotes an elemental computer operation (e.g., the addition of two bits, each either 0 or 1).
The growth of the right side of Eq. (7) as a function of L is termed^° subexponential, i.e., more rapidly than any
power of L, but less rapidly than exp(L). For a specified computer, i.e., for some specified number of processors of
specified speeds, the time $t(N)$ to perform the factoring should be proportional to $\nu(N)$. Thus Eq. (7) predicts that
the aforementioned hypothesized $10 million computer, which in 2009 will be able to factor RSA-309 in less than
a month (two weeks say), still will require about sixty million years to factor RSA-617. We add that even for very
large N the computing effort required to find a pair of primes p and q of magnitude $\approx \sqrt{N}$ is surprisingly small^-'^,
so that the ability to keep ahead of classical computer factorization abilities via steadily increasing key number N
sizes is not limited by any impracticality in finding key numbers N = pq. Similarly, although it may be thought
the increasing encryption and decryption times inevitably associated with increasing key numbers N ultimately will 
provide a practical upper bound on the size of usable N, as of the foreseeable future any such bound, though it may 
exist in principle, will be utterly inconsequential.''^ 
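
As a rough numerical check on the figures just quoted, the following Python sketch evaluates the approximate right side of Eq. (7) for L = 1024 (RSA-309) and L = 2048 (RSA-617) and scales the two-week RSA-309 factoring time by the resulting ratio. The function name `nfs_bit_ops` is introduced here purely for illustration, and the sketch assumes, as the text does, that factoring time is proportional to $\nu(N)$.

```python
import math

def nfs_bit_ops(L: float) -> float:
    """Approximate right side of Eq. (7): exp[1.32 * L^(1/3) * (log2 L)^(2/3)]."""
    return math.exp(1.32 * L ** (1 / 3) * math.log2(L) ** (2 / 3))

# Assumed key sizes: RSA-309 has L = 1024 bits, RSA-617 has L = 2048 bits.
ratio = nfs_bit_ops(2048) / nfs_bit_ops(1024)

weeks_rsa309 = 2.0                             # hypothesized RSA-309 factoring time
years_rsa617 = weeks_rsa309 * ratio / 52.18    # convert weeks to years

print(f"effort ratio : {ratio:.2e}")
print(f"RSA-617 time : {years_rsa617:.2e} years")
```

The ratio comes out somewhat above one billion, which is what turns two weeks into tens of millions of years.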

In sum, Bob's confidence in the present and future security of the RSA systems he now employs and will employ 
appears to be justified if classical computing is all that's available. On the other hand, his confidence in the continued 
security of his present and future RSA systems would not be well founded if quantum computers able to employ Shor's 
algorithm could be constructed, as this paper now goes on to demonstrate. 

III. FACTORING USING SHOR'S ALGORITHM. 

Shor's algorithm, which is designed to take advantage of the inherent potentialities of quantum (as opposed to
classical) computers, exploits a factorization method which differs from the sieves, discussed in Subsection II.D, that
presently are employed for large key number factorization. The presentation of Shor's algorithm begins, in Subsection
III.A immediately below, with an explanation of this factorization method. The relevant (for a comprehension of
Shor's algorithm) properties of quantum computers are summarized in Subsection III.B; Subsection III.C, relying on
Subsection III.B, then carefully describes and illustrates Shor's algorithm; a number of additional remarks about the
algorithm are relegated to a concluding Subsection III.D. Subsections III.A through III.D, taken together, help clarify
the precise import of the assertion that factoring increasingly large key numbers N = pq ultimately should require
less computational effort, i.e., ultimately should be more feasible, with quantum computers employing Shor's algorithm
than with classical computers.



A. Factoring N = pq Using the Order Property of Integers Coprime to N. 

Let n denote, here and henceforth unless otherwise stated, a positive integer coprime to N = pq (meaning, as
explained at the outset of Subsection II.A, that n and N have no common factors other than 1), where p and q
are two distinct large primes. For any such n, let $f_j$, j = 1, 2, 3, ..., be the remainder when $n^j$ is divided by N. Then,
as with Eqs. (2) and (3), $f_j$ is uniquely specified by the equation

$$n^j \equiv f_j \pmod{pq} \tag{8}$$

together with the condition $0 < f_j < N$. As explained in Subsection IV.B, for every n

$$n^{\phi} \equiv n^{(p-1)(q-1)} \equiv 1 \pmod{pq}, \tag{9}$$

implying $f_\phi = 1$ for every n. For any given n, however, there well may exist other integers $1 \le j < \phi = (p-1)(q-1)$
for which $f_j = 1$. The smallest such j, to be denoted by r hereinafter, is termed the order of n modulo pq. Thus,
recalling Eqs. (1) and (5), Eq. (8) for j = r can be restated as

$$n^r - 1 \equiv 0 \pmod{pq}. \tag{10}$$

Suppose now the order r of some integer n < N and coprime to N is known (how r actually is determined is discussed
below). Suppose further that r is even, necessary in order that $n^{r/2}$ be an integer and thus meaningfully employable
in congruences. Then Eq. (10) implies

$$(n^{r/2} - 1)(n^{r/2} + 1) \equiv 0 \pmod{pq}. \tag{11}$$

Because by definition r is the smallest power of n for which Eq. (10) holds, the factor $(n^{r/2} - 1)$ on the left side of
Eq. (11) cannot be exactly divided by pq, i.e.,

$$n^{r/2} - 1 \not\equiv 0 \pmod{pq}. \tag{12}$$

The second factor on the left side of Eq. (11) is not subject to any such restriction, i.e., it is possible that

$$n^{r/2} + 1 \equiv 0 \pmod{pq}. \tag{13}$$

It is not necessary that Eq. (13) be true, however, i.e., it is perfectly possible that

$$n^{r/2} + 1 \not\equiv 0 \pmod{pq}. \tag{14}$$

If both Eqs. (12) and (14) hold, we have the situation that the product on the left side of Eq. (11) is exactly divisible
by pq although neither factor in this product is exactly divisible by pq. It follows that, to avoid contradiction, one
of the factors in the product on the left side of Eq. (11) [the factor $(n^{r/2} - 1)$ say], must be divisible by p but not
by q, while the other factor [now $(n^{r/2} + 1)$] is divisible by q but not by p. When the order r of n modulo N is
even, therefore, and Eqs. (12) and (14) both hold, Bob's proclaimed key number N = pq is immediately factored
by computing the following two greatest common divisors (gcd's): of N with $(n^{r/2} + 1)$, and of N with $(n^{r/2} - 1)$;
alternatively one can factor N by first computing q say as the gcd of N with $(n^{r/2} + 1)$, and then determining p via
division of N by this q. The convenient Euclidean algorithm for finding the gcd of two integers is described under
Subheading IV.D.1.
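
The factoring recipe just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the order r is found here by classical brute force (feasible only for tiny N), the helper names `order` and `factor_from_order` are hypothetical, and $n^{r/2}$ is reduced modulo N before the gcd is taken, which does not change the gcd with N.

```python
from math import gcd

def order(n: int, N: int) -> int:
    """Smallest r >= 1 with n**r ≡ 1 (mod N); brute force, for small N only."""
    if gcd(n, N) != 1:
        raise ValueError("n must be coprime to N")
    r, f = 1, n % N
    while f != 1:
        f = (f * n) % N
        r += 1
    return r

def factor_from_order(n: int, r: int, N: int):
    """Return the two factors of N if r is even and Eq. (14) holds, else None."""
    if r % 2 != 0:
        return None                   # n^(r/2) would not be an integer power
    half = pow(n, r // 2, N)          # n^(r/2) reduced modulo N
    if (half + 1) % N == 0:           # Eq. (13) holds, so Eq. (14) fails
        return None
    p, q = gcd(half - 1, N), gcd(half + 1, N)
    return tuple(sorted((p, q)))

N, n = 55, 12
r = order(n, N)
print(r, factor_from_order(n, r, N))
```

A choice of n whose order is odd, or for which Eq. (13) holds, returns `None`, and a different n must then be tried.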

The probability that a randomly selected n < N = pq and coprime to N will have an even order r satisfying Eq.
(14) is approximately 1/2. Moreover, as also is explained under Subheading IV.D.1, calculating the gcd of a pair
of numbers using classical computers, even of a pair of large numbers like those which are likely to be encountered
in the factorization of an RSA-309 or RSA-617, is a straightforward procedure requiring negligible computing time
(in comparison with the factorization times quoted in Subsection II.D). Therefore the feasibility of factoring a large
N = pq via the procedure described in the preceding paragraph depends primarily on the feasibility of determining
the order r of n modulo N for arbitrarily selected n. With classical computers this determination in essence requires
solving the so-called discrete log problem; experience has shown, however, as Subsection II.D clearly implies and as
Odlyzko explicitly concludes, that classical factoring of large N = pq via solutions to the discrete log problem is not
more feasible than factoring N using the sieves discussed in Subsection II.D.

With quantum computers on the other hand, determining r, and thereby factoring N, becomes feasible using the
periodicity property of the sequence $f_j$, j = 1, 2, 3, ..., defined via Eq. (8). Namely it is proved in Subsection IV.A
that for any n all the integers $f_1, f_2, \ldots, f_{r-1}, f_r = 1$ are different, but that for each j in the range $1 \le j \le r$ and
every positive integer k we have $f_j = f_{j+r} = f_{j+2r} = \cdots = f_{j+kr} = f_{j+(k+1)r} = \cdots$. In other words the sequence $f_j$,
j = 1, 2, 3, ..., is periodic with period r. For example, returning now to our illustrative N = 55 key number, for
n = 16 the order r = 5 and the sequence $f_j$ is (starting with j = 1) 16, 36, 26, 31, 1, 16, 36, 26, 31, 1, 16, 36, ..., all as is
readily verified via the congruence manipulations discussed and illustrated in Subsection IV.A; similarly for n = 12
the order r is 4 and the sequence $f_j$ is 12, 34, 23, 1, 12, 34, 23, 1, 12, 34, .... Shor's algorithm, taking advantage of the fact
that a quantum computer is described by a wave function (as elaborated in Subsection III.B), i.e., has wave-like
properties, employs the quantum computer analog of Fourier transformation to extract the order r from a quantum
computer wave function which has been specially constructed to exhibit this r periodicity for some randomly selected
n. Moreover and most importantly, as also is elaborated below, the computational effort required to determine r using
Shor's algorithm increases with N no more rapidly than some power of $L = \log_2 N$, i.e. [recall Eq. (7)], increases much
more slowly with N than does the effort required to factor N using a classical computer. Classical computer factoring via
solution of the discrete log problem does not result in a slower increase of $\nu(N)$ with N than Eq. (7) manifests because,
inter alia, with such computers (perhaps since the terms wave function and wave-like properties are meaningless for
them), the number of bit operations required to calculate a Fourier transform is proportional to $NL = L2^L$, i.e.,
increases with N even more rapidly than does the right side of Eq. (7).
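
The periodicity of the sequence $f_j$ for the two illustrative choices of n can be reproduced with the short Python sketch below. The function names `f_sequence` and `period` are introduced here for illustration only, and the period is read off classically from the first occurrence of 1, which is practical only for very small N.

```python
def f_sequence(n: int, N: int, count: int) -> list[int]:
    """First `count` terms f_1, f_2, ... of Eq. (8): f_j = n**j mod N."""
    terms, f = [], 1
    for _ in range(count):
        f = (f * n) % N
        terms.append(f)
    return terms

def period(terms: list[int]) -> int:
    """1-based index of the first term equal to 1, i.e. the order r."""
    return terms.index(1) + 1

for n in (16, 12):
    seq = f_sequence(n, 55, 12)
    r = period(seq)
    # Periodicity check: f_j == f_{j+r} wherever both terms were computed.
    assert all(seq[j] == seq[j + r] for j in range(len(seq) - r))
    print(n, r, seq)
```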

Before proceeding to the summary, in Subsection III.B, of the relevant properties of quantum computers, I emphasize
(as was foreshadowed in the closing paragraph of Section I) that once a suitable r has been determined using Shor's
algorithm the factorization of N using Eqs. (12) and (14) can be routinely performed on a classical computer, making
use of the Euclidean algorithm described under Subheading IV.D.1. Referring to our numerical illustrations in the
preceding paragraph, for the choice n = 12 Shor's algorithm will yield r = 4. Then from the sequence $f_j$ for n = 12
we need to insert $f_{r/2} = f_2 = 34$ [which is congruent to $(12)^2$ modulo 55] into Eq. (11). Since Eqs. (12) and (14)
both are satisfied for this $f_2$, we immediately know [as explained beneath Eq. (14)], that: (i) $f_2 + 1 = 35$ must be
divisible by one of the factors of 55 (in this case 5, as we would determine by computing the gcd of 35 and 55), and
that (ii) $f_2 - 1 = 33$ must be divisible by the other factor of 55 (in this case 11), as we would determine either by
computing the gcd of 33 and 55, or (more simply) by direct division of 55 by its already determined factor 5.



B. Quantum Computers. 

I assume the readers of this paper have an understanding of quantum mechanics at the level of standard texts
for introductory quantum mechanics courses. For such readers, both Mermin and Grover intelligibly explain how
quantum computers differ from classical computers. I proceed to very briefly summarize the information about
quantum computers needed to comprehend the functioning of Shor's algorithm, beginning with a quote from Grover:
"Just as classical computing systems are synthesized out of two-state systems called bits, quantum computing systems
are synthesized out of two-state systems called qubits. The difference is that a bit can be in only one of the two states
at a time; on the other hand a qubit can be in both states at the same time." To elaborate, any measurement of the
state of a qubit, like any measurement of the state of a classical bit, can yield only one or the other of two and only
two possible states. Because a qubit is a quantum mechanical system describable by a wave function, however, the two
exclusive possible outcome states for a state measurement performed on a qubit typically will depend on measurement
details, as of course is not the case for a classical bit. Suppose for instance that our qubit is a spin 1/2 particle, one
of many conceivable actual physical realizations of a qubit in a practical quantum computer. Then a measurement
of the component of the particle's spin along the z direction, via a Stern-Gerlach apparatus say, can yield the results
+1/2 and -1/2 only, to which correspond respectively orthogonal wave functions which can be denoted by $|+z\rangle$ and
$|-z\rangle$ (using Dirac notation). Similarly, if a measurement of the component of spin along the y direction is performed
on a particle which has been found to have spin +1/2 along the z direction, the only possible results again are +1/2
and -1/2, to which correspond respectively orthogonal wave functions which can be denoted by $|+y\rangle$ and $|-y\rangle$.
But neither of the wave functions $|+y\rangle$ and $|-y\rangle$ is identical with the wave function $|+z\rangle$ (or with the wave function
$|-z\rangle$ for that matter). Rather each of the wave functions $|+z\rangle$ and $|-z\rangle$ is a known linear combination of the wave
functions $|+y\rangle$ and $|-y\rangle$ and vice versa, as readily can be worked out from the established theory of the behavior
of spin 1/2 wave functions under coordinate axis rotations.
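
For concreteness, the linear combinations referred to above can be written out explicitly. The relations below assume the usual Pauli-matrix phase convention, in which $|\pm y\rangle$ are the eigenvectors of $\sigma_y$ expressed in the z basis; a different phase convention would change the coefficients by overall phase factors only.

```latex
% Assumed convention: |\pm y> are the normalized eigenvectors of \sigma_y
|{\pm}y\rangle = \tfrac{1}{\sqrt{2}}\bigl(|{+}z\rangle \pm i\,|{-}z\rangle\bigr)
% Inverting these two relations:
|{+}z\rangle = \tfrac{1}{\sqrt{2}}\bigl(|{+}y\rangle + |{-}y\rangle\bigr), \qquad
|{-}z\rangle = -\tfrac{i}{\sqrt{2}}\bigl(|{+}y\rangle - |{-}y\rangle\bigr)
```

In particular a particle known to be in $|+z\rangle$ yields +1/2 or -1/2 along y with probability 1/2 each.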



1. The Computational Basis. Quantum Computer Wave Functions. 

Therefore, in order to enable convenient employment of a qubit for computational purposes, namely in order that the
two possible outcomes of state measurements on the qubit be consistently interpretable as corresponding to the binary
integers 0 and 1 respectively (as are interpretable the outcomes of state measurements on a classical bit), it is necessary
to suppose that the qubit state measurement always will be performed in the same way, e.g., with a Stern-Gerlach
apparatus always lined up along the positive z direction if the qubit is a spin 1/2 particle. On this supposition the pair
of orthogonal wave functions describing the two possible qubit state measurement outcomes customarily are denoted
by $|0\rangle$ and $|1\rangle$; these singled-out wave functions comprise the so-called computational basis, and are interpretable
respectively as corresponding to the binary integers 0 and 1. Thereupon the wave function $\Psi$ describing any arbitrary
state of the qubit, which always is a linear superposition of any pair of orthogonal wave functions, typically is expanded
in terms of $|0\rangle$ and $|1\rangle$ only. In other words, for convenient analysis of the computational functioning of the qubit,
one typically writes

$$\Psi = \mu|0\rangle + \nu|1\rangle, \tag{15}$$

where $\mu$ and $\nu$ are a pair of complex numbers satisfying

$$|\mu|^2 + |\nu|^2 = 1. \tag{16}$$

Eq. (16), which expresses the fact that $\Psi$ can be and is normalized to unity (as are $|0\rangle$, $|1\rangle$ and all wave functions
discussed below), permits the interpretation that [when the qubit wave function is given by Eq. (15)] $|\mu|^2$ is the
probability that a state measurement will yield the outcome corresponding to 0, while $|\nu|^2$ is the probability that the
same measurement will yield the outcome corresponding to 1.

A quantum computer is a collection of qubits, and as such also is a quantum mechanical system whose state must
be describable by a normalized wave function. Consider, in particular, a computer composed of just two qubits,
labeled respectively by A and B. There now are at most 2 × 2 = 4 possible different outcomes of state measurements
on the pair of qubits A, B (whether performed simultaneously or successively). Consequently the wave function of
this computer now must be a linear superposition of at most four orthogonal two-qubit basis wave functions, which
(as Mermin fully discusses) can be taken to be the computational basis products $|0\rangle_B|0\rangle_A \equiv |00\rangle$, $|0\rangle_B|1\rangle_A \equiv |01\rangle$,
$|1\rangle_B|0\rangle_A \equiv |10\rangle$, and $|1\rangle_B|1\rangle_A \equiv |11\rangle$. In other words the most general two-qubit computer wave function has the form

$$\Psi = \gamma_{00}|00\rangle + \gamma_{01}|01\rangle + \gamma_{10}|10\rangle + \gamma_{11}|11\rangle, \tag{17}$$

where in the computational basis wave functions $|00\rangle$, etc., it is understood that the two binary digits reading from left
to right correspond to the outcomes of state measurements on qubits B and A respectively, and where the associated
amplitudes $\gamma_{00}$, etc., are complex numbers satisfying

$$|\gamma_{00}|^2 + |\gamma_{01}|^2 + |\gamma_{10}|^2 + |\gamma_{11}|^2 = 1. \tag{18}$$

The digit pairs 00, 01, 10 and 11 indexing the computational basis wave functions appearing in Eq. (17) are the
binary system representations of the integers 0 through 3 (now written in the decimal system), with the proviso that
in the binary system each of these integers is to be represented by no fewer than two digits. Thus Eq. (17) can be
rewritten as

$$\Psi = \gamma_0|0\rangle + \gamma_1|1\rangle + \gamma_2|2\rangle + \gamma_3|3\rangle, \tag{19}$$

where the basis wave functions $|i\rangle$ and associated amplitudes $\gamma_i$, i = 0 to 3, obviously are merely relabelings,
respectively, of the basis wave functions $|00\rangle$, $|01\rangle$, etc., and of the amplitudes $\gamma_{00}$, $\gamma_{01}$, etc.
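
The relabeling between Eqs. (17) and (19) amounts to reading the integer index i as a zero-padded binary string. A minimal Python sketch follows; the amplitude values are arbitrary illustrative choices (not taken from the text), and `label` is a hypothetical helper name.

```python
import math

# Amplitudes gamma_i of Eq. (19), indexed by the integer i = 0..3.
# These particular values are arbitrary illustrative choices.
gamma = [0.5, 0.5j, -0.5, 0.5]

# Normalization condition, Eq. (18): the squared magnitudes sum to 1.
assert math.isclose(sum(abs(g) ** 2 for g in gamma), 1.0)

def label(i: int, qubits: int = 2) -> str:
    """Binary label of basis state |i>, padded to `qubits` digits.
    Reading left to right, the two digits refer to qubits B and A."""
    return format(i, f"0{qubits}b")

for i, g in enumerate(gamma):
    print(f"|{i}> = |{label(i)}>  amplitude {g}  probability {abs(g) ** 2}")
```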
2. Wave Function Collapse. The Born Rule.



In Eqs. (17) and (18) each $|\gamma_{\beta\alpha}|^2$ is the probability that measurements on the qubit pair A, B in the two-qubit
state described by $\Psi$ will yield state $|\alpha\rangle_A$ for qubit A and state $|\beta\rangle_B$ for qubit B, where each of $\alpha$ and $\beta$ can have
the values 0 and 1 only, all as developed immediately above. It is conceivable, however, that the observer will seek to
measure the state of qubit A only, without any attempt to ascertain the state of qubit B. In this event $|\gamma_{00}|^2 + |\gamma_{10}|^2$
is the probability of finding A in state $|0\rangle_A$, while $|\gamma_{01}|^2 + |\gamma_{11}|^2$ is the probability of finding A in state $|1\rangle_A$; we see
that these probabilities sum to unity, as they must. If a measurement on qubit A is performed, and A actually is
found to be in state $|0\rangle_A$, then the original wave function $\Psi$ of Eq. (17) is said to have been reduced or collapsed by
the measurement into the new wave function $\Psi' = \Psi_B|0\rangle_A$, wherein the one-qubit wave function for qubit B is

$$\Psi_B = [|\gamma_{00}|^2 + |\gamma_{10}|^2]^{-1/2}[\gamma_{00}|0\rangle_B + \gamma_{10}|1\rangle_B]. \tag{20}$$

Eq. (20) is in accordance with the Born rule, which Mermin fully discusses; the normalizing factor $[|\gamma_{00}|^2 + |\gamma_{10}|^2]^{-1/2}$
in Eq. (20) is needed to ensure that $\Psi'$ and $\Psi_B$ are normalized wave functions, i.e., that in $\Psi'$ and in the state of B
described by the one-qubit wave function $\Psi_B$, the individual probabilities of finding qubit B in state $|0\rangle_B$ and in state
$|1\rangle_B$ sum to unity, as again they must. Note that the squared magnitude of the coefficient of $|\beta\rangle_B$ in Eq. (20), which
represents the probability of finding qubit B in the state $|\beta\rangle_B$ knowing that a measurement on qubit A in the two-qubit
state described by $\Psi$ of Eq. (17) already has yielded $|0\rangle_A$, differs from $|\gamma_{\beta 0}|^2$ representing the probability, without any
such knowledge, that measurements on the qubit pair A, B in the two-qubit state described by $\Psi$ will find qubit A in state
$|0\rangle_A$ and qubit B in state $|\beta\rangle_B$. The modification of Eq. (20) appropriate to the circumstance that A actually had
been found in state $|1\rangle_A$ rather than in state $|0\rangle_A$ is obvious. Equally obviously [starting again with the two-qubit
system in the state described by $\Psi$ of Eq. (17)]: (i) the probability of finding qubit B in state $|\beta\rangle_B$ (here $\beta$ is either
0 or 1) without any attempt to ascertain the state of qubit A is $\sum_\alpha |\gamma_{\beta\alpha}|^2$, with the sum of these probabilities
$\sum_\beta \sum_\alpha |\gamma_{\beta\alpha}|^2 = 1$; and (ii) if B actually is found in state $|\beta\rangle_B$ ($\beta$ again either 0 or 1), the original $\Psi$ collapses into
the wave function $\Psi_A|\beta\rangle_B$, where

$$\Psi_A = [|\gamma_{\beta 0}|^2 + |\gamma_{\beta 1}|^2]^{-1/2}[\gamma_{\beta 0}|0\rangle_A + \gamma_{\beta 1}|1\rangle_A]. \tag{21}$$
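
The collapse rule of Eq. (20), and its counterpart for the outcome $|1\rangle_A$, can be illustrated with the following Python sketch. The function name `measure_A`, the dictionary layout, and the sample amplitudes are assumptions made here for illustration; the sketch computes the outcome probability and the renormalized one-qubit amplitudes for qubit B, without simulating the random choice of outcome.

```python
import math

def measure_A(gamma: dict, outcome: int):
    """Collapse a two-qubit state after qubit A is found in |outcome>.

    `gamma` maps (beta, alpha) -> amplitude gamma_{beta alpha}, with beta the
    digit for qubit B and alpha the digit for qubit A, as in Eq. (17).
    Returns (probability of the outcome, normalized amplitudes for qubit B).
    """
    prob = sum(abs(gamma[(b, outcome)]) ** 2 for b in (0, 1))
    if prob == 0:
        raise ValueError("outcome has zero probability")
    norm = math.sqrt(prob)
    psi_B = [gamma[(b, outcome)] / norm for b in (0, 1)]   # Eq. (20) when outcome = 0
    return prob, psi_B

# Illustrative amplitudes satisfying Eq. (18).
gamma = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.5j, (1, 1): -0.5}

p0, psi_B = measure_A(gamma, 0)
joint = abs(gamma[(0, 0)]) ** 2          # P(A = 0 and B = 0), no prior knowledge
conditional = abs(psi_B[0]) ** 2         # P(B = 0 given A was found in |0>)
print(p0, joint, conditional)
```

For these amplitudes the conditional probability is twice the joint probability, which is the distinction drawn in the text.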



The considerations in the two preceding paragraphs are immediately extensible to larger quantum computers,
composed of g > 2 qubits. Since a state measurement on any given qubit can have at most two different outcomes
(which we have designated by the binary system digits 0 and 1), state measurements on the entire collection of qubits
comprising a g-qubit quantum computer can have at most $2^g$ different outcomes. Correspondingly the wave function
$\Psi$ describing any state of a g-qubit quantum computer is a linear superposition of at most $2^g$ orthogonal g-qubit
basis wave functions. Index these g qubits by k running from 1 to g. Then the $2^g$ computational basis wave functions
for the computer can be taken to be

$$|0\rangle_g|0\rangle_{g-1}\cdots|0\rangle_2|0\rangle_1 \equiv |00\ldots00\rangle,\quad |0\rangle_g|0\rangle_{g-1}\cdots|0\rangle_2|1\rangle_1 \equiv |00\ldots01\rangle,\quad \ldots\quad |1\rangle_g|0\rangle_{g-1}\cdots|0\rangle_2|1\rangle_1 \equiv |10\ldots01\rangle, \tag{22}$$

etc., and [analogously to Eq. (19)] the most general g-qubit quantum computer wave function can be expressed as

$$\Psi = \sum_{i=0}^{2^g-1} \gamma_i|i\rangle, \tag{23}$$

with of course

$$\sum_{i=0}^{2^g-1} |\gamma_i|^2 = 1. \tag{24}$$


In Eqs. (23) and (24): the integers i are conveniently written in the decimal system, as in Eq. (19); the binary system
representation of each i consists of no fewer than g digits; each computational basis wave function $|i\rangle$ represents
a g-qubit state wherein (for every k, with $1 \le k \le g$) the outcome (either 0 or 1) of a state measurement on
the kth qubit surely equals the kth digit (reading from right to left) in the binary system representation of i; and
$|\gamma_i|^2$ is the probability that when the computer is in the state described by the wave function $\Psi$ of Eq. (23), state
measurements on the collection of g qubits will have the characteristic outcomes (specified earlier in this sentence) of
state measurements when the computer wave function is simply $|i\rangle$. Moreover if, while the computer is in the state
described by $\Psi$ of Eq. (23), the computer operator were to measure, e.g., the states of qubits 1, 2 and g, and were
to obtain the outcomes $|1\rangle_1$, $|0\rangle_2$ and $|1\rangle_g$ respectively, these measurements will collapse $\Psi$ into the wave function
$\Psi_{g-3}[|1\rangle_1|0\rangle_2|1\rangle_g]$, where: (i) the (g-3)-qubit wave function

$$\Psi_{g-3} = \Bigl[\sum_j |\gamma_j|^2\Bigr]^{-1/2} \sum_j \gamma_j|j'\rangle \tag{25}$$

describes the state of the remaining qubits 3, 4, ..., (g-1) knowing that state measurements on qubits 1, 2 and g in the
g-qubit system described by $\Psi$ had yielded the outcomes $|1\rangle_1$, $|0\rangle_2$ and $|1\rangle_g$ respectively; and (ii) in Eq. (25) j runs
over all integers whose g-digit binary representations begin with 1 and end with 01 (now reading from left to right),
with $|j'\rangle$ denoting the (g-3)-qubit basis wave function whose digits are those of j for qubits 3 through g-1.
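
The g-qubit collapse just described can be sketched in Python as follows. This is an illustrative classical bookkeeping of amplitudes only, under assumptions introduced here: the function name `collapse`, the convention that qubit k corresponds to the kth binary digit of i counted from the right (as in the text), and a uniform superposition on g = 4 qubits as the example state. The surviving amplitudes are kept keyed by the full index i rather than by the reduced (g-3)-qubit label.

```python
import math

def collapse(gamma: list, g: int, measured: dict):
    """Collapse a g-qubit state after measuring some of its qubits.

    gamma[i] is the amplitude of basis state |i>, i = 0 .. 2**g - 1.
    `measured` maps qubit index k (1 = rightmost binary digit of i) to the
    observed outcome 0 or 1.  Returns (probability, surviving amplitudes),
    where the surviving amplitudes are keyed by i and renormalized.
    """
    def digit(i, k):
        return (i >> (k - 1)) & 1          # k-th binary digit from the right

    kept = {i: a for i, a in enumerate(gamma)
            if all(digit(i, k) == v for k, v in measured.items())}
    prob = sum(abs(a) ** 2 for a in kept.values())
    if prob == 0:
        raise ValueError("measured outcomes have zero probability")
    norm = math.sqrt(prob)
    return prob, {i: a / norm for i, a in kept.items()}

g = 4
gamma = [0.25] * 2 ** g                    # uniform superposition (illustrative)
# Qubits 1, 2 and g found in |1>, |0> and |1> respectively, as in the text.
prob, psi = collapse(gamma, g, {1: 1, 2: 0, g: 1})
print(prob, sorted(format(i, f"0{g}b") for i in psi))
```

The surviving indices are exactly those whose g-digit binary representations begin with 1 and end with 01.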



3. Operations on Quantum Computers. Unitarity. 

It can be assumed that the g-qubit quantum computer wave function $\Psi$ of Eq. (23), like the wave function of any
other quantum mechanical system, evolves in time in accordance with the non-relativistic time-dependent Schroedinger
equation

$$i\hbar\frac{\partial\Psi}{\partial t} = H\Psi, \tag{26}$$

where $\hbar$ is Planck's constant divided by $2\pi$ and the Hamiltonian H, which may be time dependent, is some appropriate
Hermitian operator capable of meaningfully acting on the various computational basis wave functions $|i\rangle$ appearing in Eq. (23).
In this circumstance the wave function $\Psi(t)$ at any time t > 0 is related to the wave function $\Psi(0)$ at t = 0 by

$$\Psi(t) = U\Psi(0), \tag{27}$$

where, because H is Hermitian, U = U(t) is a linear normalization-conserving operator, i.e., a unitary operator.
Whatever the physical realizations of the individual qubits comprising the quantum computer may be, the computer's
utility as a computational tool depends on the ability (of the person performing the computation) to control the
time evolution of its wave function. But this desired controlled evolution, which generally requires modifying the
environments of the individual qubits (e.g., when the qubits are spin 1/2 particles, rotating the individual magnetic
fields acting on those particles), still necessarily is a time evolution of $\Psi$ under Eq. (26). Thus the desired controlled
evolution also is described by Eq. (27), i.e., necessarily involves a unitary operation on the initial wave function $\Psi(0)$,
here the computer wave function when the modifications of the individual qubit environments were initiated.
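
The statement that a unitary operator conserves normalization can be checked numerically for a single qubit. The sketch below uses the Hadamard matrix as a familiar example of a one-qubit unitary and a projector as a linear operator that is not unitary; both examples, and the helper names, are illustrative choices made here rather than anything prescribed by the text.

```python
import math

def apply(U, psi):
    """Apply a 2x2 matrix U (list of rows) to a one-qubit state [mu, nu]."""
    return [sum(U[r][c] * psi[c] for c in range(2)) for r in range(2)]

def norm_sq(psi):
    """Sum of squared magnitudes of the amplitudes, cf. Eq. (16)."""
    return sum(abs(a) ** 2 for a in psi)

def is_unitary(U, tol=1e-12):
    """Check that U-dagger times U equals the identity for a 2x2 matrix."""
    for r in range(2):
        for c in range(2):
            entry = sum(U[k][r].conjugate() * U[k][c] for k in range(2))
            if abs(entry - (1 if r == c else 0)) > tol:
                return False
    return True

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]            # Hadamard matrix: unitary
P = [[1, 0], [0, 0]]             # projector onto |0>: linear but not unitary

psi = [0.6, 0.8j]                # normalized: 0.36 + 0.64 = 1
print(is_unitary(H), norm_sq(apply(H, psi)))
print(is_unitary(P), norm_sq(apply(P, psi)))
```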

Accordingly each planned operation in the sequence of operations constituting any proposed quantum computing
algorithm, e.g., Shor's algorithm, must be a unitary operation. Postulated non-unitary operations on a quantum
computer, no matter how attractive seemingly, are irrelevant and thus of no interest whatsoever to the use of the computer
as a computational tool, because no non-unitary operation will be attainable with any actual physical realizations
of the qubits comprising the computer. Therefore it is important to note (as we amplify in various Subheadings
under Subsection III.C below) that each of the quantum computing operations involved in Shor's algorithm indeed is
unitary. Of course the impossibility of constructing a physical realization of any non-unitary operation does not imply
every conceivable unitary operation on a quantum computer can be physically realized; furthermore if the computer
is composed of a large number of qubits, e.g., thousands of qubits (as is quite likely in actual practical applications
of Shor's algorithm, see under Subheading III.C.1 below), the prospect of actually constructing a physical realization
of any non-trivial unitary operation U on so large a collection of qubits seems hopeless at first sight, even if there
is reason to believe that a physical realization of U must exist. Fortunately, however, and absolutely crucial for the
practical potential of quantum computation, it can be proved that every conceivable unitary operation on an arbitrarily
large g-qubit quantum computer, even an operation involving simultaneous modifications of the environments of
all $g \gg 2$ qubits, can be reproduced via an appropriate sequence of basic unitary one-qubit and two-qubit operations
only. Moreover, and also very worthy of note, numerous methods, based solely on known physics, for achieving
physical realizations of these basic unitary one-qubit and two-qubit operations (also known as universal quantum
gates) have been proposed, although admittedly the actual implementation of many of these conceptual physical
realizations well may prove to be difficult in practice.

For the purposes of this paper, therefore, it is not unreasonable to assume that quantum computers consisting
of arbitrarily large assemblages of qubits, capable of performing any desired computational algorithm which can be
formulated in terms of unitary operations, eventually will be constructed. Granted this assumption, a measure of
the quantum computational effort required to perform any given algorithm, indeed the only obvious measure, is the
number of universal quantum gates that must be strung together to perform the algorithm on a quantum computer;
in essence the universal quantum gate operations play the role, for quantum computation, that the bit operations
referred to in connection with Eq. (7) play for classical computation. Thus we now finally are able to make precise,
as we very much need to do, the meaning of our oft-repeated assertion that the Shor algorithm enables a quantum
computer to factor large key numbers N = pq with far less computational effort than using a classical computer
requires. In particular, with a quantum computer using Shor's algorithm the number $\nu_q$ of universal quantum gates
required to determine an order r that will enable factorization of a large N = pq via Eq. (11) is estimated to obey
the equation

$$\nu_q(N) = O[(\ln N)^2(\ln\ln N)(\ln\ln\ln N)] = O[L^2(\log_2 L)(\log_2\log_2 L)], \tag{28}$$

where $L = \log_2 N$ as in Eq. (7) and the symbol O, denoting Order of, means there exists a constant K such that
for sufficiently large N

$$\nu_q(N) \le K[L^2(\log_2 L)(\log_2\log_2 L)]. \tag{29}$$

In connection with Eqs. (28) and (29) it is useful to recognize that for large N the number of qubits required to
represent N is essentially L. To be precise, for any real number x > 1, let [x] denote the largest integer $\le x$; then it
is easily seen that the number of qubits needed to represent any N is $[\log_2 N] + 1$, which for large N differs negligibly
from L.
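
The qubit count $[\log_2 N] + 1$ is simply the number of binary digits of N, which Python exposes directly for integers of any size. The helper name `qubits_needed` below is introduced only for illustration; `int.bit_length` is used instead of a floating-point logarithm so that the result stays exact for key-sized N.

```python
import math

def qubits_needed(N: int) -> int:
    """[log2 N] + 1, i.e. the number of binary digits of the positive integer N."""
    return N.bit_length()

# For small N the exact count agrees with the floating-point formula.
for N in (55, 63, 64):
    assert qubits_needed(N) == math.floor(math.log2(N)) + 1
    print(N, qubits_needed(N))

# A 1024-bit key number (here simply the largest one) needs L = 1024 qubits.
print(qubits_needed(2 ** 1024 - 1))
```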

The immediately preceding discussion has overlooked the fact that in practice the actual factorization of N using
Shor's algorithm requires computational operations (e.g., classical computer gcd calculations as discussed at the end of
Subsection III.A) beyond the universal quantum gate operations whose number is estimated in Eq. (28). Subsection III.D
below implies, however, that for the purposes of this paper such neglected computational operations do not negate
the validity of Eq. (29) as a measure of the computational effort required to factor a large N = pq with a quantum
computer using Shor's algorithm. Therefore comparison of Eqs. (7) and (29) validly quantifies the reduction in the
computational effort required to factor a large N that is achievable with a quantum computer. Whereas according
to Eq. (7) the number of elemental computer operations needed to accomplish the factorization of N with a classical
computer increases faster than any power of $L = \log_2 N$, the needed number of elemental computer operations using
a quantum computer increases only barely more rapidly than $L^2$ (indeed surely less rapidly than $L^3$) according to Eq.
(29). To illustrate the practical import of this reduction, let us repeat the numerical exercise presented immediately
below Eq. (7), only this time for a quantum computer. For any specified sufficiently large quantum computer (i.e., for
any quantum computer composed of sufficiently many qubits to handle the Shor algorithm determination of r for all
relevant N, see under Subheading III.C.1), the time $\tau_q(N)$ needed to complete the factorization of a large N should be
approximately proportional to $\nu_q(N)$ given by Eq. (29), irrespective of the value of K appropriate to that computer.
Suppose we were able to construct a quantum computer which, like the classical computer we previously hypothesized,
could factor RSA-309 in two weeks time. Then this same quantum computer (again assumedly composed of sufficiently
many qubits) should be able to factor RSA-617 in no more than about 9 weeks, in contrast to the 60 million years
the classical computer which factored RSA-309 would require.

Moreover it is not unreasonable to believe that a sufficiently large quantum computer, if such computers can be
built at all, will be able to factor RSA-309 in two weeks. Two weeks is about $1.2 \times 10^6$ seconds. For RSA-309, i.e.,
for L = 1024, the value of $\nu_q$ from Eq. (29) is only $3.5 \times 10^9$ even assuming K will be as large as 100, which seems
doubtful. To factor RSA-309 in two weeks, therefore, the average time for performing a quantum gate operation need
be no faster than about 300 microseconds, which should be no problem whatsoever for quantum computer elements,
whether operating on atomic, molecular or photonic scales. In short, once sufficiently large quantum computers
become available Bob no longer will be justifiably confident that, merely via increases of his proclaimed key number
size, he can maintain the security of Alice's RSA-coded messages to him in the face of anticipated likely improvements
in quantum computer capabilities.
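
The numbers quoted in the two preceding paragraphs follow directly from the right side of Eq. (29), as the Python sketch below suggests. The function name `shor_gate_bound` is hypothetical, and the sketch treats the bound of Eq. (29) as if it were the actual gate count, which is the same simplification the text makes.

```python
import math

def shor_gate_bound(L: float, K: float = 1.0) -> float:
    """Right side of Eq. (29): K * L^2 * log2(L) * log2(log2(L))."""
    return K * L ** 2 * math.log2(L) * math.log2(math.log2(L))

# Time scaling from RSA-309 (L = 1024) to RSA-617 (L = 2048); K cancels.
ratio = shor_gate_bound(2048) / shor_gate_bound(1024)
weeks_rsa617 = 2.0 * ratio               # if RSA-309 takes two weeks

# Gate time allowing RSA-309 to be factored in two weeks with K = 100.
two_weeks_s = 14 * 24 * 3600             # about 1.2e6 seconds
gates_rsa309 = shor_gate_bound(1024, K=100)
gate_time_us = two_weeks_s / gates_rsa309 * 1e6

print(f"RSA-617: {weeks_rsa617:.1f} weeks")
print(f"gates for RSA-309: {gates_rsa309:.2e}")
print(f"allowed time per gate: {gate_time_us:.0f} microseconds")
```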

Before closing this discussion of operations on quantum computers it is important to note that wave function
collapsing measurements on any part of a quantum computer, though normalization conserving by virtue of the
Born rule referred to under Subheading III.B.2, are not, strictly speaking, quantum computing operations of the
sort discussed in earlier paragraphs under the present Subheading. In particular, let $\Psi_M$ denote the wave function
describing the state of the g - G remaining qubits after observing the outcomes of state measurements on any G-qubit
subset ($1 \le G < g$) of a g-qubit quantum computer described by the wave function $\Psi$ of Eq. (23); for instance $\Psi_M$
might be the wave function $\Psi_{g-3}[|1\rangle_1|0\rangle_2|1\rangle_g]$ introduced in connection with Eq. (25). Then, as Mermin discusses,
because both $\Psi_M$ and $\Psi$ are normalized wave functions expressible as linear superpositions of the very same set of $2^g$
orthogonal computational basis wave functions, there must exist a unitary operator $U_M$ such that

$$\Psi_M = U_M\Psi. \tag{30}$$

Since $\Psi$ can be thought of as $\Psi(0)$, the computer wave function at time t = 0 when the measurement operation
began, and $\Psi_M$ can be thought of as $\Psi(t)$, the computer wave function at time t > 0 when the measurement has
been completed, Eq. (30) appears to be of precisely the same form as Eq. (27). The subtle difference is that whereas
in Eq. (27) we have been considering unitary operators which are predictably controllable [i.e., which during each
step of the computational algorithm will cause the computer wave function $\Psi(0)$ to evolve into some desired $\Psi(t)$],
$U_M$ of Eq. (30) generally is not predictably controllable. Rather the $U_M$ we obtain as a result of the measurement
generally is only one of many possible $U_M$, whose likelihoods of turning up in the actual measurement operation
we have performed depend on the values of the coefficients $\gamma_i$ in Eq. (23); only after we observe the measurement
outcome can we decide which of the many possible $U_M$ actually has been obtained. Correspondingly, there generally
is no way, before the measurement operation, to introduce a sequence of universal quantum gates which will reproduce
the unitary operator $U_M$ of Eq. (30) that actually is attained.

C. The Operations Constituting Shor's Algorithm. 

Shor's original formulation of his algorithm has been given an admirably readable (by non-specialists) step-by-step
prescription by Williams and Clearwater, which my presentation will closely follow, but also will expand on and
illustrate. The text of each subheading in this Subsection very briefly describes one of those eight steps.

1. Determine the Minimum Computer Size Required. Divide the Qubits into Two Registers. 

The reader is reminded of the notation and contents of Subsection III.A. Shor's algorithm seeks to accurately discern the periodicity with period r manifested by the sequence $f_j$, j = 1, 2, 3, ..., obtained from Eq. (8) for some chosen n. In order to do so, the algorithm must operate on sequences which are many many periods in length (see under Subheading III.C.7), much as in conventional classical Fourier transformation. In actual practice the order r well may attain its maximum possible value $(p - 1)(q - 1)/2$, which for large N is likely to be only very slightly smaller than half the to-be-factored N = pq (see Subsection IV.B); for instance in the case of our illustrative N = 55, the order r equals its maximum allowed value 20 for fully 16 of the 40 integers n < 55 that are coprime to 55, including n as small as 2 and as large as 53. Consequently accurate determination of r using Shor's algorithm generally requires the use of powers $j \gg N$ in Eq. (8). Shor recommends (in effect) that the maximum power $j = j_{max}$ employed be no less than $N^2$, a recommendation this paper accepts; Williams and Clearwater recommend $j_{max}$ be even greater, namely at least $2N^2$. Thus, following Shor, the quantum computer being employed to determine r via Shor's algorithm must contain at least enough qubits to represent powers j up to $j_{max} = N^2$. The minimum number of qubits needed to represent the integer $N^2$ is $[\log_2 N^2] + 1$ [recall the text immediately following Eq. (29)]; accordingly, the quantum computer will be supposed to contain a set of $y = [\log_2(N^2)] + 1$ qubits, which will be said to comprise register Y. In addition the computer must contain a second set of qubits, here termed register Z, capable of storing the computed values of $f_j$, which can be as large as N - 1; the size of this register will be taken to be its minimum possible value $z = [\log_2(N - 1)] + 1$ qubits. Note that because N is known to be the product of a pair of odd primes, i.e., surely is not a power of two, $2^{y-1} < N^2 < 2^y$.
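The register sizes and the N = 55 order count quoted above can be checked with a short classical calculation. The following is only an illustrative sketch: the function names `multiplicative_order` and `register_sizes` are my own, and the order is found by brute force, which is feasible only for small N such as this illustrative case.

```python
from math import gcd

def multiplicative_order(n, N):
    """Smallest r >= 1 with n**r = 1 (mod N); n must be coprime to N."""
    if gcd(n, N) != 1:
        raise ValueError("n must be coprime to N")
    r, value = 1, n % N
    while value != 1:
        value = (value * n) % N
        r += 1
    return r

def register_sizes(N):
    """Return (y, z) with y = [log2(N^2)] + 1 and z = [log2(N - 1)] + 1."""
    # For a positive integer m, m.bit_length() equals floor(log2 m) + 1.
    y = (N * N).bit_length()
    z = (N - 1).bit_length()
    return y, z

N = 55
y, z = register_sizes(N)
# Orders of all integers n < N that are coprime to N (brute force, small N only).
orders = {n: multiplicative_order(n, N) for n in range(1, N) if gcd(n, N) == 1}
max_order_count = sum(1 for r in orders.values() if r == 20)
print(y, z, len(orders), max_order_count)
```

For N = 55 this reproduces y = 12, z = 6, the 40 integers coprime to 55, and the 16 of them whose order is 20.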

With the large N of interest herein, the difference between $[\log_2(N^2)] + 1$ and 2L, like the difference between $[\log_2(N - 1)]$ and L or between L and L + 1, is negligible for the purpose of estimating the computational efforts required to accomplish the various individual steps constituting Shor's algorithm; $L = \log_2 N$, as previously. Thus in any subsequent estimates herein of those efforts y justifiably will be replaced by 2L (ignoring the fact that 2L need not be an integer); this is the same replacement for y employed to obtain Eq. (28). For the purpose of such estimates the difference between $[\log_2(N^2)] \cong 2L$ and $[\log_2(2N^2)] \cong 2L + 1$ also is negligible, meaning that the estimated computational efforts required to accomplish the various individual steps constituting Shor's algorithm actually do not depend significantly on whether we prefer the Shor or the Williams-Clearwater estimates of the required $j_{max}$. Furthermore we now can conclude that unless for large N the actually required value of $j_{max}$ differs from Shor's recommended value by quantities exceeding $O(N^2)$ [where O is defined as in Eq. (28)], determining r and thereby factoring a large N will require a quantum computer not less than about 3L qubits in size. In other words, factoring a key number of the presently recommended size RSA-309 corresponding to 1024 binary digits (recall Subsection II.D) seemingly would require a quantum computer of at least 3072 qubits in size; factoring RSA-617 would require a quantum computer of more than 6,000 qubits in size.
2. Load the First Register With the Integers Less Than or Equal to $2^y - 1$.

After ordering and indexing the y qubits in register Y as discussed in connection with Eqs. (22)-(25), the complete set of computational basis wave functions for those qubits can be written as $|j\rangle_Y$, where: the subscript Y indicates that we are writing wave functions for register Y; j is an integer, $0 \le j \le 2^y - 1$, which shall be written in decimal notation; when register Y is in the state described by the basis wave function $|j\rangle_Y$, the binary digit representation of j immediately reveals the one-qubit basis state, $|0\rangle_k$ or $|1\rangle_k$, of each of the qubits in register Y; and it is understood that qubit k ($1 \le k \le y$), whose basis states are identified by the subscript k, corresponds to the kth digit, reading from right to left, in the binary system representation of j [as discussed immediately beneath Eq. (24)]. The computational basis wave functions for register Z similarly are denoted by $|i\rangle_Z$, where $0 \le i \le 2^z - 1$. It is postulated that initially every one of the y + z qubits constituting the quantum computer can be set into its own one-qubit $|0\rangle$ basis state, i.e., that the initial wave function of the entire quantum computer is $\Psi_C^{(0)} = |0\rangle_Y |0\rangle_Z$; here and hereinafter, the subscript C denotes the wave function of the entire computer. Proceeding with the algorithm requires transforming the initial register Y wave function $\Psi_Y^{(0)} = |0\rangle_Y$ to its first stage form

$\Psi_Y^{(1S)} = 2^{-y/2} \sum_{j=0}^{2^y - 1} |j\rangle_Y$, (31)

i.e., requires replacing the initial $|0\rangle_Y$ [wherein a measurement of the state of register Y can only yield the result 0] by the sum on the right side of Eq. (31) [wherein, recalling the discussion of Eqs. (23)-(24), a measurement of the state of register Y has an equal chance of yielding any of the integers between 0 and $2^y - 1$ inclusive]. There are $2^y$ independent $|j\rangle_Y$ on the right side of Eq. (31); thus the factor $2^{-y/2}$ guarantees $\Psi_Y^{(1S)}$ is normalized. Since we know $N^2 \le 2^y - 1$, the sum in Eq. (31) includes every j less than or equal to Shor's recommended $j_{max} = N^2$.

The transformation of $|0\rangle_Y$ to $\Psi_Y^{(1S)}$ of Eq. (31) is accomplished via use of the one-qubit operation $U_H$ known as a Hadamard transformation, which is defined so that the results of the Hadamard operation on the one-qubit basis state wave functions $|0\rangle$ and $|1\rangle$ are

$U_H|0\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$, $U_H|1\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$. (32)

$U_H$ is known to be unitary, as is trivially verifiable; the factor $1/\sqrt{2}$ in Eq. (32) enables $U_H$ to preserve normalization, as we know a unitary operation must. Denote the operation that performs the Hadamard on qubit k alone, without affecting any other qubits, by $U_{Hk}$. Next consider the result of operating with $U_{Hk}$ on a computational basis wave function $|j\rangle_Y$ for which the kth digit (reading from right to left) in the binary expansion of j is zero (not 1), meaning [recall Eqs. (22)-(25)] that the product of one-qubit basis wave functions constituting $|j\rangle_Y$ includes the factor $|0\rangle_k$ (not $|1\rangle_k$). To obtain this desired result, we need only put the subscript k on every one of the basis wave functions in Eq. (32); moreover since our present $|j\rangle_Y$ contains no $|1\rangle_k$ basis state, we are here concerned only with the first equality in Eq. (32). It follows that, except for the factor $1/\sqrt{2}$, the operation $U_{Hk}$ on our present $|j\rangle_Y$ merely replaces $|0\rangle_k$ in $|j\rangle_Y$ by $|0\rangle_k + |1\rangle_k$ while leaving $|j\rangle_Y$ otherwise unchanged. In the binary expansion of the integer j, however, changing the kth digit from 0 to 1 (always reading from right to left) produces the binary expansion of the integer $j + 2^{k-1}$. Therefore when $|j\rangle_Y$ contains no $|1\rangle_k$ basis state,

$U_{Hk}|j\rangle_Y = \frac{1}{\sqrt{2}}\{|j\rangle_Y + |j + 2^{k-1}\rangle_Y\}$. (33)

Now perform the y operations $U_{H1}, U_{H2}, U_{H3}, ..., U_{Hy}$ sequentially (first $U_{H1}$, second $U_{H2}$, etc.) on the initial register Y wave function $|0\rangle_Y = \Psi_Y^{(0)}$. We know $|0\rangle_Y$ contains no $|1\rangle_i$ factors for any i, $1 \le i \le y$. Thus we surely can employ Eq. (33) for the first of these operations to obtain

$\Psi_Y^{(1)} = U_{H1}|0\rangle_Y = \frac{1}{\sqrt{2}}\{|0\rangle_Y + |0 + 2^0\rangle_Y\} = \frac{1}{\sqrt{2}}\{|0\rangle_Y + |1\rangle_Y\} = \frac{1}{\sqrt{2}} \sum_{j=0}^{1} |j\rangle_Y$. (34)

Because $U_{H1}$ has been defined so that it performs the Hadamard operation on qubit 1 only, the wave function $\Psi_Y^{(1)}$ (like $\Psi_Y^{(0)}$) does not contain the factor $|1\rangle_2$, as is directly evidenced by the fact that both the integers 0 and 1 on the right side of Eq. (34) are less than 2. Consequently we also can employ Eq. (33) for the second of these sequential operations, thereby finding for $U_{H2}U_{H1}|0\rangle_Y = U_{H2}\Psi_Y^{(1)} = \Psi_Y^{(2)}$,

$\Psi_Y^{(2)} = \frac{1}{\sqrt{2}} U_{H2}\{|0\rangle_Y + |1\rangle_Y\} = \frac{1}{2}\{[|0\rangle_Y + |0 + 2\rangle_Y] + [|1\rangle_Y + |1 + 2\rangle_Y]\} = \frac{1}{2} \sum_{j=0}^{3} |j\rangle_Y$. (35)

Because every one of the integers on the right side of Eq. (35) is less than $4 = 2^2$, $\Psi_Y^{(2)}$ surely does not contain the factor $|1\rangle_3$, permitting use of Eq. (33) to evaluate $U_{H3}\Psi_Y^{(2)}$. Proceeding in this fashion, it is readily seen that the result of the full sequence of Hadamard operations on $|0\rangle_Y$ is

$U_{Hy} \cdots U_{H2} U_{H1} |0\rangle_Y = 2^{-y/2} \sum_{j=0}^{2^y - 1} |j\rangle_Y$. (36)

The right side of Eq. (36) is the desired $\Psi_Y^{(1S)}$ of Eq. (31). It is generally agreed that the above-defined Hadamard one-qubit operations are universal quantum gates, as defined under Subheading III.B.3. Accordingly accomplishing this first stage transformation of the initial $|0\rangle_Y$ to $\Psi_Y^{(1S)}$ requires no more than $y \cong 2L$ universal quantum gates.
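The action of the successive one-qubit Hadamard operations can be mimicked classically for a small register. The sketch below stores the register Y wave function as a plain list of $2^y$ amplitudes indexed by j; the names `apply_hadamard` and `uniform_superposition` are hypothetical, and such a simulation of course costs an amount of memory exponential in y, unlike the quantum computer it imitates.

```python
from math import sqrt

def apply_hadamard(state, k):
    """Apply the Hadamard on qubit k (k = 1 is the rightmost binary digit of j)
    to a list of amplitudes indexed by the integer j."""
    bit = 1 << (k - 1)                  # the integer 2**(k-1)
    new_state = [0.0] * len(state)
    for j, amp in enumerate(state):
        if amp == 0.0:
            continue
        j0, j1 = j & ~bit, j | bit      # j with its kth digit set to 0 / to 1
        # First equality of Eq. (32) if the kth digit is 0, second (minus sign) if it is 1.
        sign = -1.0 if j & bit else 1.0
        new_state[j0] += amp / sqrt(2)
        new_state[j1] += sign * amp / sqrt(2)
    return new_state

def uniform_superposition(y):
    """Start from |0>_Y and apply the Hadamard to qubits 1, 2, ..., y in turn."""
    state = [0.0] * (2 ** y)
    state[0] = 1.0
    for k in range(1, y + 1):
        state = apply_hadamard(state, k)
    return state

state = uniform_superposition(4)
print(state)    # every one of the 16 amplitudes should be close to 2**(-4/2) = 0.25
```

Stopping the loop after k = 1 or k = 2 gives the intermediate wave functions of Eqs. (34) and (35).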



3. Select an n. For Each j in Register Y, Place the Remainder $f_j = n^j$ (Modulo N) Into Register Z.

After this just completed first stage, the wave function of the entire computer is $\Psi_C^{(1S)} = \Psi_Y^{(1S)}|0\rangle_Z$, meaning that after completion of the first stage a state measurement on register Z still is guaranteed to yield the integer 0 only, irrespective of what j may be yielded by a simultaneous state measurement on register Y. For the next step an n coprime to N is required. As Subsection IV.B explains, such an n can be obtained, with probability essentially indistinguishable from unity, simply by choosing an arbitrary integer i, either in the range 1 < i < N or in the range $1 < i < 2^z - 1$. Of course whether any selected integer i actually is coprime to N readily can be tested by calculating the gcd of i and N using a classical computer (as discussed under Subheading IV.D.1), but the probability such a randomly chosen i will not be coprime to N is so small the effort of computing this gcd does not seem worthwhile. If the selected i is not coprime to N, that fact will become apparent when it is realized the value of the supposed order r, inferred as in step 7 below, does not satisfy Eq. (10); Subsection IV.B explains that no integer r can satisfy Eq. (10) when the integer n in Eq. (10) actually is not coprime to N. In this event it merely will be necessary to repeat steps 2 through 7 after choosing a different i, which new i essentially certainly will be coprime to N; such repetitions often are required even when the chosen i indeed is coprime to N (see under Subheading III.C.8 below).

Assuming now an i = n coprime to N actually has been selected, the next step of the algorithm transforms $\Psi_C^{(1S)}$ to its second stage form

$\Psi_C^{(2S)} = 2^{-y/2} \sum_{j=0}^{2^y - 1} |j\rangle_Y |f_j\rangle_Z$, (37)

where $f_j$ is defined by Eq. (8). With the computer wave function $\Psi_C^{(2S)}$ of Eq. (37), the result of a state measurement on the collection of qubits in register Z must yield one of the remainder integers $1 \le f_j \le N - 1$ prescribed by Eq. (8). Moreover, because of the periodicity of $f_j$ demonstrated in Subsection IV.A, every one of the $f_j$ in Eq. (37) must equal one of the (all necessarily different) $f_1, f_2, ..., f_r = f_0 = 1$. No other integers can result from a state measurement on register Z after completion of the second stage of the algorithm; in particular, since n is coprime to N by definition, such a measurement now cannot possibly yield the previously (at completion of the first stage) assured result 0.

I shall not detail the operation which transforms $\Psi_C^{(1S)}$ to $\Psi_C^{(2S)}$ of Eq. (37). The operation is fully discussed by Shor, who shows it is unitary. The number of universal quantum gates required to perform this unitary operation is $O[L^2 (\log_2 L)(\log_2 \log_2 L)]$. Let us illustrate Eq. (37) when N = 55 and n = 16. In these circumstances, as discussed in Subsection III.A, r = 5 and the sequence $f_j$ (now starting with j = 0) is 1,16,36,26,31,1,16,36,26,31,1,16,36,.... Accordingly in this illustrative case Eq. (37) is

$\Psi_C^{(2S)} = 2^{-y/2}\{|0\rangle_Y|1\rangle_Z + |1\rangle_Y|16\rangle_Z + |2\rangle_Y|36\rangle_Z + |3\rangle_Y|26\rangle_Z + |4\rangle_Y|31\rangle_Z + |5\rangle_Y|1\rangle_Z + |6\rangle_Y|16\rangle_Z + |7\rangle_Y|36\rangle_Z + |8\rangle_Y|26\rangle_Z + |9\rangle_Y|31\rangle_Z + |10\rangle_Y|1\rangle_Z + |11\rangle_Y|16\rangle_Z + |12\rangle_Y|36\rangle_Z + ... + |2^y - 1\rangle_Y |f_{2^y - 1}\rangle_Z\}$. (38)

As Eq. (38) illustrates, the sequence of register Z basis wave functions $|1\rangle_Z, |f_1\rangle_Z, |f_2\rangle_Z, ..., |f_{2^y - 1}\rangle_Z$ in Eq. (37) manifests the same periodicity with r as does its originating sequence $f_j$, $0 \le j \le 2^y - 1$.
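The remainders $f_j$ that step 3 places into register Z are easily tabulated classically for the illustrative N = 55, n = 16 case. The sketch below is merely a check on the quoted sequence; the helper name `remainders` is an assumption of mine, and the loop over all $2^y$ values of j is practical only because y = 12 here.

```python
def remainders(n, N, count):
    """Return [f_0, f_1, ..., f_{count-1}] with f_j = n**j mod N."""
    values, f = [], 1
    for _ in range(count):
        values.append(f)
        f = (f * n) % N
    return values

N, n, y = 55, 16, 12
f = remainders(n, N, 2 ** y)     # one remainder for every j in register Y
r = f.index(1, 1)                # first j > 0 at which f_j returns to 1
print(f[:13], r)
```

The leading terms printed are 1, 16, 36, 26, 31, 1, 16, 36, ..., so that r = 5, and no $f_j$ equals 0.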
4. Measure the State of Register Z.

The entire computer now is in the state represented by $\Psi_C^{(2S)}$ of Eq. (37). The objective of the next three steps in the algorithm is to extract the value of r from the just discussed periodicity of $\Psi_C^{(2S)}$. Note that although we know $\Psi_C^{(2S)}$ has the form given in Eq. (37), until we begin making measurements on register Z we can have no idea of what values of $f_j$ actually are appearing in Eq. (37). Moreover the wave function collapse discussed under Subheading III.B.2 means that any single measurement on register Z, though it surely will reveal one of the values of $f_j$ appearing in Eq. (37), automatically will destroy all information about the other values of $f_j$. Nevertheless the next step in the algorithm is to measure the state of register Z. Suppose this register Z measurement on the computer in the state represented by Eq. (37) yields the particular value $f_k$ (of the r possible values $f_0 = 1, f_1, f_2, ..., f_{r-1}$). Then after the measurement the wave function of register Y takes its third stage form

$\Psi_Y^{(3S)} = Q^{-1/2} \sum_{b=0}^{Q-1} |k + br\rangle_Y$, (39)

where, again as discussed under Subheading III.B.2: Eq. (39) has retained those and only those $|j\rangle_Y$ in Eq. (37) which are multiplied by $|f_k\rangle_Z$; Q equals the number of terms in Eq. (37) containing the factor $|f_k\rangle_Z$; and the factor $Q^{-1/2}$ is necessary to guarantee the wave function $\Psi_Y^{(3S)}$ of Eq. (39) is normalized, consistent with the Born rule.

To help comprehend the structure of Eq. (39) and to see how the magnitude of Q is estimated, let us return to our N = 55, n = 16, r = 5 illustrative case. Suppose the result of the register Z measurement on the computer in the state represented by Eq. (38) had been $f_j = 36$. Then after the measurement the wave function of register Y in this third stage of the operation of the algorithm was

$\Psi_Y^{(3S)} = Q^{-1/2}\{|2\rangle_Y + |7\rangle_Y + |12\rangle_Y + ... + |2 + 5(Q - 1)\rangle_Y\}$. (40)

Evidently the measurement has shifted the dependence on r, from the periodicity with r of the sequence $|f_j\rangle_Z$ in Eq. (37), to the periodicity of an arithmetic progression (with common difference r) of the integers j = k + br indexing the computational basis wave functions appearing in Eq. (39). It is evident from Eq. (40) that the value of Q in Eq. (39) is determined by the condition that k + r(Q - 1) cannot exceed $2^y - 1$, the largest j appearing in Eq. (37). Since $0 \le k < r$, and Q is an integer by definition, this condition implies

$Q = [r^{-1}(2^y - 1 - k)] + 1$, (41)

with [x] denoting the largest integer $\le x$, as previously. We see that unless $2^y/r$ is an integer, Q in Eq. (39) equals either $[2^y/r]$ or $[2^y/r] + 1$, depending on the value of k; for the large N cases of interest herein, either of these two possible values of Q is very well approximated by $2^y/r$, because r < N/2 (see Subsection IV.B) whereas $2^y > N^2$ (as noted under Subheading III.C.1). When $2^y/r$ is an integer, however, i.e., when r happens to be a power of 2 (as can happen, recall the illustrative $f_j$ sequence N = 55, n = 12 discussed in Subsection III.A), Eq. (41) makes Q exactly equal to $2^y/r$ (no approximation needed) for every allowed value of k.
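The count Q of Eq. (41) can be compared with a direct enumeration of the register Y indices that survive the register Z measurement. This is a small classical sketch for the illustrative N = 55 cases only; the function names are hypothetical, and k labels the measured remainder $f_k$ as in the text.

```python
def surviving_indices(n, N, y, k):
    """Indices j in [0, 2**y - 1] whose remainder n**j mod N equals f_k."""
    target = pow(n, k, N)                       # the measured value f_k
    return [j for j in range(2 ** y) if pow(n, j, N) == target]

def q_from_eq41(r, y, k):
    """Q = [(2**y - 1 - k) / r] + 1, with [x] the largest integer <= x."""
    return (2 ** y - 1 - k) // r + 1

N, n, y, r, k = 55, 16, 12, 5, 2                # f_2 = 36, the case of Eq. (40)
indices = surviving_indices(n, N, y, k)
print(indices[:3], indices[-1], len(indices), q_from_eq41(r, y, k))
```

For this case the surviving indices begin 2, 7, 12 and their number agrees with Eq. (41). With n = 12 (order r = 4, a power of 2) the same functions give Q = 1024 = $2^y/r$ for every k.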



5. Perform a Quantum Fourier Transform Operation on the Register Y Wave Function. 

The desired r finally is extracted from $\Psi_Y^{(3S)}$ of Eq. (39) via a quantum Fourier transform operation $U_{FT}$. The operation $U_{FT}$ transforms any state $|j\rangle_Y$ in register Y to

$U_{FT}|j\rangle_Y = 2^{-y/2} \sum_{c=0}^{2^y - 1} e^{2\pi i jc/2^y} |c\rangle_Y$. (42)

After the operation $U_{FT}$, therefore, the wave function for register Y takes its fourth stage form

$\Psi_Y^{(4S)} = U_{FT}\Psi_Y^{(3S)} = (2^y Q)^{-1/2} \sum_{c=0}^{2^y - 1} \sum_{b=0}^{Q-1} e^{2\pi i (k + br)c/2^y} |c\rangle_Y$. (43)

It has been shown that $U_{FT}$ can be written as a product of universal quantum gates, implying that $U_{FT}$ is unitary, as we know it is required to be. The number of gates required is $O(L^2)$.

In Eq. (43) the coefficient of any given $|c\rangle_Y$ is a geometric series, i.e., is trivially summable. Performing the sum yields

$\Psi_Y^{(4S)} = (2^y Q)^{-1/2} \sum_{c=0}^{2^y - 1} e^{2\pi i kc/2^y} \frac{1 - e^{2\pi i rcQ/2^y}}{1 - e^{2\pi i rc/2^y}} |c\rangle_Y = (2^y Q)^{-1/2} \sum_{c=0}^{2^y - 1} e^{2\pi i kc/2^y} e^{\pi i rc(Q-1)/2^y} \frac{\sin(\pi rcQ/2^y)}{\sin(\pi rc/2^y)} |c\rangle_Y$. (44)

6. Measure the State of the Y Register. 

This measurement will find the Y register in some particular state $|c\rangle_Y$. The probability $P_c$ of finding the state $|c\rangle_Y$ is given by the square of the absolute value of the coefficient of $|c\rangle_Y$ in Eq. (44), namely

$P_c = (2^y Q)^{-1} \frac{\sin^2(\pi rcQ/2^y)}{\sin^2(\pi rc/2^y)}$. (45)

It is worth remarking that because the step 5 operation $U_{FT}$ does not involve the Z register, the same Eq. (45) for the probability of finding the Y register in the state $|c\rangle_Y$ would obtain even if the step 4 measurement of the Z register's state had been postponed to the present step, i.e., even if the states of both registers had been simultaneously measured after performance of the quantum Fourier transform operation, as Shor and Volovich prescribe. For this paper's pedagogical purpose, however, it is preferable to measure the states of the two registers in two separate steps, as Williams and Clearwater also recognize.

In order to grasp the implications of Eq. (45), it is helpful to consider first the exceptional case that the order r is a power of 2. In this circumstance, Q exactly equals $2^y/r$ as explained under Subheading III.C.4. Correspondingly Eq. (45) becomes

$P_c = \frac{r}{2^{2y}} \frac{\sin^2(\pi c)}{\sin^2(\pi rc/2^y)}$. (46)

Because c is an integer $0 \le c \le 2^y - 1$, Eq. (46) implies $P_c = 0$ for any c other than values of c for which $rc/2^y$ is an integer d, as can occur since $2^y/r$ now is an integer. For such exceptional values of c, namely the values of c for which

$\frac{c}{2^y} - \frac{d}{r} = 0$, (47)

the right side of Eq. (46) becomes 0/0 and we have to return to Eq. (43), wherein we see that except for the common factor $e^{2\pi i kc/2^y}$ every term in the sum over b for given c is unity. The number of terms in the sum is Q. So when r is a power of 2 and the Y register is in the state described by the wave function $\Psi_Y^{(4S)}$ of Eq. (43), the probability $P_c$ that the Y register will be found in the basis state $|c\rangle_Y$ is zero except when c satisfies Eq. (47), in which event $P_c = (2^y Q)^{-1}Q^2 = 1/r$. Moreover, since $c < 2^y$, the only values of c that can be observed are those corresponding to the integers d in the range $0 \le d < r$; thus the total probability of observing these values of c is $rP_c = 1$, as of course it must be.
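The power-of-2 case can be checked numerically by summing the series in Eq. (43) directly. The sketch below does so for N = 55, n = 12, whose order is r = 4, with y = 12; the function name `prob_c` and the choice k = 1 are illustrative assumptions, and the floating point results agree with 1/r only to rounding accuracy.

```python
import cmath

def prob_c(c, k, r, y):
    """P_c from Eq. (43): |sum over b of exp(2 pi i (k + b r) c / 2**y)|**2 / (2**y * Q)."""
    size = 2 ** y
    Q = (size - 1 - k) // r + 1                  # Eq. (41)
    amplitude = sum(cmath.exp(2j * cmath.pi * (k + b * r) * c / size)
                    for b in range(Q))
    return abs(amplitude) ** 2 / (size * Q)

y, r, k = 12, 4, 1                               # k = 1 is an arbitrary choice
peaks = [d * 2 ** y // r for d in range(r)]      # the c satisfying Eq. (47)
for c in peaks:
    print(c, prob_c(c, k, r, y))                 # each value should be close to 1/r
print(1000, prob_c(1000, k, r, y))               # an off-peak c: should be close to 0
```

The four values of c satisfying Eq. (47) are 0, 1024, 2048 and 3072, and between them they carry essentially all of the probability.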

Consider now the more general circumstance that the order r is not purely a power of 2. Then there no longer exist values of c which satisfy Eq. (47) for every integer d, $0 \le d < r$; in fact if r is odd one sees there are no values of c satisfying Eq. (47) (apart from the trivial case c = 0, d = 0). Also we know from the discussion under Subheading III.C.4 that $Q - 2^y/r$ now equals a non-integer $\xi$, where $-1 < \xi < 1$. Accordingly, when r is not a power of 2 the numerator in Eq. (45) is not zero except possibly at a limited number of very special values of c; in other words for most, perhaps all values of c, $P_c$ now is not zero. Nevertheless for each integer d in the allowed range the probability of observing the result c in a measurement on the Y register remains large for, and only for, those exceptional values of c which, though no longer satisfying Eq. (47), come close to doing so. To quantify this assertion note first that because r < N/2, the maximum allowed value of d/r (namely 1 - 1/r) surely is less than the maximum allowed value of $c/2^y$ (namely $1 - 1/2^y$, which is $> 1 - 1/N^2$). Thus, since the spacing between successive values of $c/2^y$ is $2^{-y}$, now every allowed value of d/r either exactly satisfies Eq. (47) for some value of c or else differs from some $c/2^y$ by no more than $2^{-y}/2$. In other words when r is not purely a power of 2 Eq. (47) is replaced by

$\left| \frac{c}{2^y} - \frac{d}{r} \right| \le 2^{-y}/2$, (48)

with the assurance that for each allowed value of d/r there exists a single $c = c_1$ satisfying Eq. (48) (except when the equality holds for such a $c_1$, in which event the equality holds for a second $c = c_2 = c_1 \pm 1$, corresponding to d/r lying exactly halfway between two successive values of $c/2^y$). When c satisfies Eq. (48), therefore, we have

$\frac{c}{2^y} = \frac{d}{r} + \epsilon 2^{-y}$, (49)

where $-1/2 \le \epsilon \le 1/2$.

Employing Eq. (49) in Eq. (45), the probability of finding the Y register in this $|c\rangle_Y$ state now (when r no longer is a power of 2) is seen to be

$P_c = (2^y Q)^{-1} \frac{\sin^2(\pi r\epsilon Q/2^y)}{\sin^2(\pi r\epsilon/2^y)} \ge \frac{Q}{2^y} \frac{\sin^2(\pi r\epsilon Q/2^y)}{(\pi r\epsilon Q/2^y)^2}$, (50)

using the fact that $\sin x \le x$; the equality in Eq. (50) holds only when $\epsilon = 0$. Because $2^y$ is so very large compared to both unity and r < N/2, estimating the right side of Eq. (50) via replacement of Q by $2^y/r$ (although Q actually differs from $2^y/r$ by a quantity $\xi$, $|\xi| < 1$) can be seen to introduce an inconsequential error; in other words the argument of the sine function on the right side of Eq. (50) may be taken to be $\pi\epsilon$. Hence Eq. (50) yields

$P_c \ge \frac{1}{r} \frac{\sin^2 \pi\epsilon}{(\pi\epsilon)^2} \ge \frac{1}{r} \frac{4}{\pi^2}$, (51)

where the second inequality results from recognizing that $\sin x/x$ is a decreasing function of x in the range $0 \le x \le \pi$, and then replacing $|\epsilon|$ by its maximum allowed value 1/2. Since there is such a c and associated $P_c$ for each of the r allowed values of d in Eq. (48), we conclude that even when r is not a power of 2 the total probability $P = rP_c$ of finding the Y register in a state $|c\rangle_Y$ for which c satisfies Eq. (48) is not less than $4/\pi^2 \cong 0.4$.
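The lower bound just derived can be compared with a direct numerical evaluation for the illustrative N = 55, n = 37 case, whose order r = 20 is not a power of 2. The following sketch sums the series of Eq. (43) exactly (to floating point accuracy) at the c nearest each $d \cdot 2^y/r$; the function name `prob_c` and the choice k = 3 are assumptions made only for this illustration.

```python
import cmath
from math import pi

def prob_c(c, k, r, y):
    """P_c from Eq. (43): |sum over b of exp(2 pi i (k + b r) c / 2**y)|**2 / (2**y * Q)."""
    size = 2 ** y
    Q = (size - 1 - k) // r + 1                  # Eq. (41)
    amplitude = sum(cmath.exp(2j * cmath.pi * (k + b * r) * c / size)
                    for b in range(Q))
    return abs(amplitude) ** 2 / (size * Q)

y, r, k = 12, 20, 3                              # k = 3 is an arbitrary choice
size = 2 ** y
# For each d, the integer c nearest to d * 2**y / r satisfies Eq. (48).
per_d = [prob_c(round(d * size / r), k, r, y) for d in range(r)]
total = sum(per_d)
print(total, 4 / pi ** 2)                        # total should exceed the bound 4/pi**2
```

In this example the total probability found lies well above the guaranteed $4/\pi^2$, in keeping with the remark that the bound is conservative.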

This just stated result for P has been obtained by Ekert and Jozsa; it is larger than the value of P originally quoted by Shor. It is clear from its derivation, however, that this lower bound of 0.4 (though rigorously derived) considerably underestimates the magnitude of P that is likely to be encountered in practice. For example, if in Eq. (51) $|\epsilon|$ is replaced not by its maximum value but rather by its average value 1/4, then Eq. (51) yields $P_c \ge 8/(\pi^2 r)$, corresponding to $P \ge 0.81$. Use of the average value of $|\epsilon|$ to estimate P is reasonable because in general the value of $\epsilon$ depends on d, as can be seen from Eq. (49) recollecting that $2^y/r$ is not an integer unless r is a power of 2.



7. Infer the Value of r. 

After a value of c has been obtained, i.e., after the state measurement on register Y prescribed in the previous step, it still is necessary to infer the value of r. To help understand how this inference is accomplished, I observe first that, because r is less than N/2, there can be only one permissible fraction d/r satisfying Eq. (48) for any given c; here "permissible" means of course that d is an integer $0 \le d < r < N/2$. To prove this assertion note that if $d_1/r_1$ and $d_2/r_2$ are two distinct permissible fractions, i.e., if $d_1/r_1 \ne d_2/r_2$, then

$\left| \frac{d_1}{r_1} - \frac{d_2}{r_2} \right| = \frac{|d_1 r_2 - d_2 r_1|}{r_1 r_2} \ge \frac{1}{r_1 r_2} > \frac{4}{N^2}$, (52)

since when $d_1/r_1 \ne d_2/r_2$, the integer $(d_1 r_2 - d_2 r_1)$ cannot equal 0. On the other hand if these $d_1/r_1$ and $d_2/r_2$ each satisfy Eq. (48) for the same c,

$\left| \frac{d_1}{r_1} - \frac{d_2}{r_2} \right| \le 2(2^{-y}/2) = 2^{-y} < \frac{1}{N^2}$. (53)

Since Eqs. (52) and (53) are inconsistent, the impossibility of finding two distinct d/r satisfying Eq. (48) for the same c is proved.

Suppose now that our state measurement on the Y register has yielded a $|c\rangle_Y$ state whose c satisfies Eq. (48). The actual evaluation of this (now known to be unique) d/r, and thence r, from Eq. (48) is performed by expanding $c/2^y$ into a continued fraction, as Shor originally proposed. I shall not detail herein the construction of continued fraction expansions; such quite comprehensible discussions are readily found, and I do provide an illustrative expansion below as well as (under Subheading IV.D.2) an explanation of the relation between continued fraction expansions and gcd calculations. Suffice it to say that the continued fraction expansion of any rational number x provides a series of fractions (with each fraction in lowest terms) called convergents to x, such that each successive convergent furnishes an improved approximation to x. A key theorem is: Let a/b be a fraction satisfying

$\left| \frac{a}{b} - x \right| < \frac{1}{2b^2}$. (54)

Then a/b is one of the continued fraction convergents to x. Eq. (48) has the form of Eq. (54), with $x = c/2^y$ and a/b = d/r; since $2^y > N^2$, the right side of Eq. (48) is less than $(2N^2)^{-1}$, which in turn is less than $(2r^2)^{-1}$ because r < N/2. Therefore, by this just quoted theorem, d/r must be one of the continued fraction convergents to $c/2^y$, i.e., expanding $c/2^y$ in its series of continued fraction convergents inevitably will yield d/r in lowest terms. Note that this result demonstrates the critical importance of choosing the size y of the Y register such that $2^y \gg N$. Indeed, if $2^y \le N^2/4$, the right side of Eq. (48) would be $\ge 2/N^2$, i.e., no longer would ensure that d/r is one of the continued fraction convergents to $c/2^y$. Similarly, if $2^y \le N^2/4$, the right side of Eq. (53) would be $\ge 4/N^2$, i.e., Eq. (53) no longer would be inconsistent with Eq. (52), implying that it now no longer is guaranteed there is only one permissible d/r satisfying Eq. (48).

Let me illustrate this beautifully simple continued fraction method of determining d/r. To factor our illustrative N = 55 via Shor's algorithm a Y register of size y = 12 qubits will be employed, as prescribed under Subheading III.C.1 ($2^{11} = 2048 < N^2 = 3025 < 2^{12} = 4096$). For this N the order of n = 37 is r = 20, the largest possible value of r for this N (as stated under the same Subheading). For d = 11, the value of d/r is exactly 0.55. We have 2252/4096 = 0.54980; 2253/4096 = 0.55005; and $2^{-y}/2 = 0.00012$, which is < 0.00020 = 0.55 - 2252/4096, but is > 0.00005 = 2253/4096 - 0.55. Then if we assume the state measurement on the Y register has yielded the state $|c\rangle_Y$ consistent with Eq. (48) for r = 20 and d = 11, the value of c must have been 2253. We now expand 2253/4096 in a continued fraction:

$\frac{2253}{4096} = \cfrac{1}{\frac{4096}{2253}} = \cfrac{1}{1 + \frac{1843}{2253}} = \cfrac{1}{1 + \cfrac{1}{\frac{2253}{1843}}} = \cfrac{1}{1 + \cfrac{1}{1 + \frac{410}{1843}}} = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{\frac{1843}{410}}}} = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \frac{203}{410}}}} = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \cfrac{1}{2 + \frac{4}{203}}}}}$, (55)

and so on. Dropping the fraction 410/1843 in the expression to the right of the fourth equality sign in Eq. (55) yields the first convergent, namely 1/2; dropping the fraction 203/410 in the expression to the left of the last equality in Eq. (55) yields the second convergent, namely 5/9 = 0.5555. Each of these two convergents differs from 2253/4096 by an amount whose absolute value exceeds 0.00012, i.e., each of these convergents fails to satisfy Eq. (48) and so cannot equal the desired d/r. Lo and behold, however, the third convergent, obtained by dropping the fraction 4/203 in the expression to the right of the last equality in Eq. (55), is precisely 11/20, confirming the theorem quoted in the preceding paragraph. Moreover, because we know r < N/2, which in this illustrative case is 55/2, the result that d/r = 11/20 immediately implies that r = 20, because any other fraction equal to 11/20, e.g., 22/40, inevitably has a denominator > 55/2.
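The continued fraction expansion just illustrated is the Euclidean algorithm in disguise, and its convergents can be generated with the standard two-term recurrence. The sketch below is one possible classical implementation; the function names are my own. It also lists the trivial leading convergents 0 and 1, which the count of "first", "second" and "third" convergents in the text omits.

```python
from fractions import Fraction

def convergents(numerator, denominator):
    """Yield the continued fraction convergents of numerator/denominator in lowest terms."""
    h_prev, h = 0, 1          # convergent numerators, two steps back and one step back
    k_prev, k = 1, 0          # convergent denominators, likewise
    a, b = numerator, denominator
    while b:
        q, rem = divmod(a, b)                   # next partial quotient (Euclid's step)
        h_prev, h = h, q * h + h_prev
        k_prev, k = k, q * k + k_prev
        yield Fraction(h, k)
        a, b = b, rem

def convergent_satisfying_eq48(c, y, N):
    """First convergent d/r of c / 2**y with r < N/2 and |c/2**y - d/r| <= 2**(-y)/2, else None."""
    x = Fraction(c, 2 ** y)
    half_spacing = Fraction(1, 2 ** (y + 1))
    for conv in convergents(c, 2 ** y):
        if 2 * conv.denominator < N and abs(x - conv) <= half_spacing:
            return conv
    return None

print(list(convergents(2253, 4096)))
print(convergent_satisfying_eq48(2253, 12, 55))  # expected to be 11/20
```

Exact rational arithmetic via `Fraction` is used so that the comparison with $2^{-y}/2$ in Eq. (48) is free of rounding error.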

I next observe that because r < N/2, not merely < N, it follows from Eq. (54) that even if the right side of Eq. (48) had been replaced by $2/N^2$, values of $c/2^y$ satisfying the thus modified Eq. (48) would have continued fraction convergents equal to d/r. But $2^y > N^2$ implies $2/2^y < 2/N^2$. In other words if a state measurement on the Y register should yield a $|c\rangle_Y$ whose c satisfies

$\left| \frac{c}{2^y} - \frac{d}{r} \right| < 2(2^{-y})$, (56)

that $c/2^y$ also will have d/r as a continued fraction convergent even though the value of c may not satisfy Eq. (48). Therefore we now have another reason (in addition to the previously explained desirability of using an average $|\epsilon|$) for asserting that the quantity $4/\pi^2$ quoted following Eq. (51) greatly underestimates the probability of measuring states $|c\rangle_Y$ which can yield d/r. To more accurately estimate this probability, note that if $c/2^y > d/r$ satisfies Eq. (48), then each of c - 2, c - 1, c and c + 1 will satisfy Eq. (56). Similarly, if $c/2^y < d/r$ satisfies Eq. (48), then each of c - 1, c, c + 1 and c + 2 will satisfy Eq. (56). In either case, adding the appropriate four $P_c$ from Eq. (45), using $\sin x \le x$ as in Eq. (50), and approximating Q by $2^y/r$ as was done in deriving Eq. (51), we find that the probability of measuring a state $|c\rangle_Y$ which will have d/r as a continued fraction convergent of $c/2^y$ is

$P'_c \ge \frac{\sin^2 \pi\epsilon}{\pi^2 r} \left\{ \frac{1}{(1 + \epsilon)^2} + \frac{1}{\epsilon^2} + \frac{1}{(1 - \epsilon)^2} + \frac{1}{(2 - \epsilon)^2} \right\}$, (57)

where now $0 \le \epsilon \le 1/2$, and the prime on $P'_c$ indicates that we have summed over the appropriate four $P_c$ as explained above. For $\epsilon = 1/2$, we obtain $P'_c = 80/(9\pi^2 r) = 0.90/r$; using the average $\epsilon = 1/4$ we obtain $P'_c = 0.935/r$.
Returning to our illustrative continued fraction expansion, it is readily verified that each of the continued fraction expansions of 2251/4096, 2252/4096 and 2254/4096, like the Eq. (55) expansion of 2253/4096, does have 11/20 as a convergent, consistent with our employment of Eq. (57) to estimate the probability of correctly inferring d/r from a state measurement $|c\rangle_Y$.

8. Repeat Steps 2 Through 7 Until Factorization of N is Achieved. 

Inferring the value of r need not immediately lead to factorization of N, however. In the first place, as was mentioned in Subsection III.A, the probability that r will meet the necessary requirements for being able to factor N, namely that r is even and satisfies Eq. (14), is only about 1/2. Thus although the probability of being able to infer a d/r via a single measurement of the Y register is so high, namely over 90%, nevertheless on the average it will be necessary to run through the entire sequence of steps 2 through 7 at least twice before a d/r whose r can be employed to factor N is obtained. The entire sequence must be repeated starting from step 2 (we don't have to make any new decisions about the sizes of the registers) because after step 7 the Y register is in whatever state $|c\rangle_Y$ was measured. The wave function of this state is nothing like the initial loading wave function $\Psi_Y^{(1S)}$ of Eq. (31) from which the Shor algorithm operations departed, beginning with step 3; also, unless we already have cleared register Z to the state $|0\rangle_Z$, the operations described in steps 2 and 3 will not yield the desired $\Psi_C^{(2S)}$ of Eq. (37). In these repetitions, although the Y register wave function always will be brought to its initial loading form Eq. (31), i.e., although step 2 always will be the same, the choice of n in step 3 had better be different; otherwise carrying the algorithm through to step 7 merely will again yield an r which cannot be employed to factor N.

Furthermore, even granting that the n selected in step 3 does possess an order r which is employable to factor N, inferring r from the computed d/r may not be as simple as the discussion under the immediately preceding Subheading has suggested. Suppose, again returning to our illustrative N = 55, n = 37, r = 20 example, the register Y state measurement had yielded c = 2048, which for r = 20 satisfies Eq. (48) with d = 10. But the computer operator doesn't know r = 20; all the operator knows is that 2048/4096 = 1/2, the sole convergent (which has to be in lowest terms remember) to 2048/4096. The operator immediately will discover $37^2 \equiv 49 \not\equiv 1$ (mod 55), so that 2 surely is not the order of 37, but then what? Each of the fractions 2/4, 3/6, 4/8, ..., 13/26 equals 1/2 and has a denominator < 55/2, i.e., each of these denominators could be the desired r. In principle the operator could test the powers $37^4$, $37^6$, $37^8$, ... (mod 55) until he/she came to $37^{20} \equiv 1$ (mod 55). For the large N of interest, however, e.g., RSA-309, persistently trying to determine r in this crude fashion after the register Y measurement has yielded a convergent with a denominator b for which $b \ll N$ and $n^b \not\equiv 1$ (mod N), obviously would be ridiculous and would negate the whole point of using Shor's algorithm. Shor suggests the operator should try a few small multiples of b, e.g., 2b and 3b; but after finding $n^{2b}$ and $n^{3b} \not\equiv 1$ (mod N), the operator seemingly would have little choice but to repeat steps 2 through 7 (this time using the same value of n of course), in the hope that the newly measured $c/2^y$ would have a convergent whose denominator actually was r, not a factor of r.

How many times the operator may expect to have to repeat steps 2 through 7 before reliably inferring r (still
assuming the operator has selected an n possessing an employable r) is difficult to say. A seeming overestimate of
the expected number of such repetitions follows from considerations first advanced by Shor and refined by Ekert
and Jozsa. The number of positive integers d less than r that are coprime to r is $\phi(r)$, where $\phi$ is Euler's totient
function (the subject of Subsection IV.B). Then if P' is the probability (equal to at least 0.9 as we have seen)
that a measurement on the Y register will yield a $c/2^y$ with a convergent equal to some d/r, 0 < d < r, then P'' =
$P'\phi(r)/r$ is the probability that the measurement will yield a convergent equal to a d/r wherein d is prime to r. Ekert
and Jozsa quote the theorem that for sufficiently large r

$$\frac{\phi(r)}{r} \gtrsim \frac{0.56}{\ln\ln r} \approx \frac{1.17}{\log_2\log_2 r}. \qquad (58)$$

Because the typical r is expected to increase as N increases, Eq. (58) suggests that whatever may be the number of
repetitions of steps 2 through 7 otherwise required (e.g., repetitions because r is not always employable to factor N), those
repetitions might need to be increased by a factor of about $\log_2\log_2 r$ because of the just discussed complications
associated with fractions d/r in Eq. (56) wherein d is not coprime to r.

This just estimated increase in the required number of repetitions probably is an overestimate because it does not
take into account the likelihood (as explained in the penultimate paragraph) that the operator will infer r without
recourse to repetitions when r is only a small multiple of the denominator b of the measured convergent, e.g., when
b equals r/2 or r/3. The operator also can minimize the number of required repetitions by recognizing (as Shor also
remarks) that if two measured convergents have denominators $b_1 \ll N/2$ and $b_2 \ll N/2$ with $b_1$ coprime to $b_2$, then
the only way for r to be a multiple of each of $b_1$ and $b_2$ is for r to be a multiple of $b_1 b_2$, which now may be sufficiently
large to ensure that r is either $2b_1b_2$ or $3b_1b_2$. For instance, returning once again to our illustrative N = 55, n = 37,
r = 20 example, if after obtaining the convergent 1/2 the operator were to repeat steps 2 through 7 with the same n,
and if this repeat should yield the convergent 3/5, the operator would infer with high probability (greater than 0.9
as discussed above) that r is a multiple of 10; once having discovered that $(37)^{10} \equiv 34$ (mod 55), the operator would
infer with the same high probability that r = 20, because 30 = 3(10) is >55/2 and therefore cannot be r. Indeed once
having verified that $(37)^{20} \equiv 1$ (mod 55), the knowledge that $(37)^{10} \equiv 34$ (mod 55) immediately enables the factors
5 and 11 of N = 55 to be determined, precisely as was illustrated at the end of Subsection III.A.
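
The inference just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `candidate_orders` and `find_order` are hypothetical (they do not appear in the text), and the two convergent denominators 2 and 5 are simply taken as given rather than measured.

```python
from math import gcd

N, n = 55, 37      # illustrative key number and base from the text
b1, b2 = 2, 5      # denominators of the convergents 1/2 and 3/5 (assumed already measured)

def candidate_orders(b1, b2, N):
    """Multiples of b1*b2 (b1 coprime to b2) that are smaller than N/2."""
    step = b1 * b2
    return [m for m in range(step, N, step) if 2 * m < N]

def find_order(n, N, candidates):
    """Return the first candidate r with n**r = 1 (mod N), or None if there is none."""
    for r in candidates:
        if pow(n, r, N) == 1:
            return r
    return None

candidates = candidate_orders(b1, b2, N)        # 30 is excluded because 30 > 55/2
r = find_order(n, N, candidates)
half = pow(n, r // 2, N)                        # the remainder of n**(r/2)
factors = (gcd(half + 1, N), gcd(half - 1, N))  # gcd step described after Eq. (14)
print(candidates, r, half, factors)
```

For the illustrative numbers the only candidates are 10 and 20, and the classical test of $(37)^{10}$ and $(37)^{20}$ settles the matter.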

D. Concluding Remarks. 

The foregoing completes this paper's presentation of the operations constituting Shor's algorithm. The following
added remarks seem appropriate, however. The algorithm involves the application of unitary operations at steps 2,
3 and 5. The estimated numbers of gates required to accomplish each of these steps are stated in the text under
their respective subheadings; denote these estimated numbers by $\nu_{q2}$, $\nu_{q3}$ and $\nu_{q5}$ respectively. The estimated total
number of gates required, denoted by $\nu_q$ in Eq. (28), equals $\nu_{q2} + \nu_{q3} + \nu_{q5}$. We see that in the limit of very large N
the estimates $\nu_{q2}$ and $\nu_{q5}$ become negligible compared to $\nu_{q3}$. Accordingly Eq. (28) equates $\nu_q$ to $\nu_{q3}$, in agreement
with conventional procedure. Admittedly Eq. (28) has not taken into account the operations, gate or otherwise,
required to perform the state measurements postulated under the algorithm steps 4 and 6. We have seen that the
quantum computer can carry out the algorithm with no more than about 3L qubits, however; it is difficult to see why
the required number $\nu_m$ of measurement operations, including the post-measurement operations needed to restore
the computer wave function to its (for the purpose of the algorithm, recall the discussion under Subheading III.C.2)
starting form $|0\rangle_Y|0\rangle_Z$, should be other than proportional to the number of qubits. Consequently the failure to
include state measurement operations in no way invalidates employing Eq. (28), which grows somewhat faster than
$L^2$ with increasing N, to estimate the growth with N of the computing effort required to perform a factorization of
N using Shor's algorithm.

If in practice repetitions of the algorithm steps are necessary, as the discussion under the step 8 Subheading indicates
almost always will be the case, then those repetitions (each of which necessitates a new setting up of the gates) should
be taken into account in any estimate, such as Eq. (28), of the total number of gates required to determine an order r
permitting factorization of N. Because the Order Of symbol O defined in Eq. (29) has been included in Eq. (28), any
required number of repetitions which does not increase with N (e.g., the expected number of repetitions associated
with the fact that some r will not be employable to factor N), does not demand any correction of Eq. (28). On
the other hand the number of repetitions suggested by Eq. (58), although very likely a considerable overestimate of
the actual number of required repetitions associated with the desirability of measuring a d/r for which d is prime to
r, probably does demand some modification of Eq. (28). If we assume that the typical r < N/2 tends to be some
fixed fraction of N, then for large N we can replace $\log_2\log_2 r$ with $\log_2\log_2 N$, therewith concluding that our earlier
text concerning Eq. (28) should have been supplemented by: An upper bound $\nu_{qub}(N)$ on the expected number
of universal quantum gates that actually will have to be employed in a Shor algorithm determination of an order r
permitting factorization of N is obtained via multiplication of $\nu_q(N)$ in Eq. (28) by $\log_2 L$, yielding

$$\nu_{qub}(N) = O[L^2(\log_2 L)^2(\log_2\log_2 L)]. \qquad (59)$$

Eq. (59) only very minimally weakens the conclusions drawn earlier from comparison of Eqs. (7) and (28), or from 
computing the actual magnitude of $\nu_q(N)$ given by Eq. (29). For instance, whereas previously we concluded that a
quantum computer which could factor RSA-309 in two weeks time should be able to factor RSA-617 in no more than 
about 9 weeks, Eq. (59) leads to the conclusion that a quantum computer which surely can factor RSA-309 in no 
more than 2 weeks will factor RSA-617 in at most 10 weeks. 
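
The two timing figures just quoted can be reproduced approximately by the short Python sketch below. It rests on assumptions that should be stated plainly: Eq. (28) is taken to grow as $L^2(\log_2 L)(\log_2\log_2 L)$ (the form implied by Eq. (59) divided by $\log_2 L$), all constant factors are assumed to cancel in the ratio, and RSA-309 and RSA-617 are taken to have L = 1024 and L = 2048 binary digits respectively. The function names are hypothetical.

```python
from math import log2

def nu_q(L):
    """Assumed growth of Eq. (28): L^2 (log2 L)(log2 log2 L), constants dropped."""
    return L**2 * log2(L) * log2(log2(L))

def nu_qub(L):
    """Growth of Eq. (59): the Eq. (28) growth multiplied by log2 L."""
    return nu_q(L) * log2(L)

L_309, L_617 = 1024, 2048   # assumed bit lengths of RSA-309 and RSA-617
weeks_309 = 2.0             # factoring time postulated in the text for RSA-309

weeks_28 = weeks_309 * nu_q(L_617) / nu_q(L_309)      # scaling by Eq. (28)
weeks_59 = weeks_309 * nu_qub(L_617) / nu_qub(L_309)  # scaling by Eq. (59)
print(round(weeks_28, 2), round(weeks_59, 2))
```

Under these assumptions the two estimates come out at roughly 9 and 10 weeks, the extra factor between them being just $\log_2 2048/\log_2 1024 = 1.1$.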

After a c has been measured, as described under the step 6 Subheading, the following calculations still must be
performed: (i) infer an r from the measured c, which generally involves a continued fraction expansion as discussed
under the Step 7 Subheading; (ii) verify that the inferred r satisfies Eqs. (10), (12) and (14) (as it must if this r is to
be employable to factor N), which involves computing $n^r$ and $n^{r/2}$ (mod N); and (iii) then actually obtain the factors
p and q of N, which process involves computing greatest common divisors as discussed immediately following Eq.
(14). At present it is not contemplated that any of these required calculations will be accomplished by any computer
other than a purely classical one. The efforts required to accomplish these computations have not been included in Eq.
(28), nor could they be, because Eq. (28) estimates the number of universal quantum gates required, not the number
of classical computer bit operations as in Eq. (7). On the other hand the efforts to perform these classical calculations
are not irrelevant to any realistic estimate of the potential utility of Shor's algorithm for factoring increasingly large
N. It is pertinent to remark, therefore, that (see Section IV.D) for none of the computations listed under (i)-(iii)
immediately above does the number of required bit operations increase with N more rapidly than the right side of
Eq. (59). Consequently Eq. (59) correctly exhibits the maximum expected growth with increasing N = pq of the
total computational efforts, quantum plus classical, required to complete a factorization of N using Shor's algorithm.
Correspondingly, the conclusions we have drawn from comparisons of Eqs. (7) and (28) remain valid, except for the
very minimal weakening discussed immediately beneath Eq. (59), even though the derivation of Eq. (28) ignored the
classical computer calculations presently inherent in the use of Shor's algorithm to factor N.

Until Shor produced his algorithm it was generally believed that the computational effort required to factor
N = pq grows more rapidly with increasing N than any polynomial in $L = \log_2 N$, as Eq. (7) manifests. Shor's
demonstration that use of a quantum computer could decrease this growth to slower than $L^3$ was astonishing, therefore,
and has greatly accelerated attempts to construct an actually functioning quantum computer. The key Shor algorithm
operation, the operation that enables the greatly diminished growth of the computational effort with N, is the quantum
Fourier transform $U_{FT}$ operation discussed under the step 5 Subheading. The quantum Fourier transform is a direct
generalization (to quantum mechanical basis states) of the classical computing discrete Fourier transform, which
in turn is nothing more than a discretized (summation rather than integration) version of the conventional Fourier
integral transform. Thus it does not seem surprising that application of $U_{FT}$ to the wave function of Eq. (39)
yields a new wave function, namely that of Eq. (44), wherein the probability $P_c$ [that a measurement on the Y
register will yield the basis state $|c\rangle_Y$] is large only for those values of c from which the periodicity with r inherent
in Eq. (40) can be inferred. What is very remarkable, however, and what makes possible the comparatively slow
increase with N of $\nu_q(N)$ in Eq. (28), is the fact that although the discrete Fourier transform calculation requires
O(NL) bit operations, performance of $U_{FT}$ can be accomplished with only $O(L^2)$ universal quantum gates, as
stated immediately following Eq. (43). It must be remembered that $U_{FT}$ is a $2^y \times 2^y$ matrix, i.e., at least an $N^2 \times N^2$
unitary matrix; because an arbitrary unitary matrix of this dimensionality contains $N^4$ free parameters, in general one
expects that reproducing a given $N^2 \times N^2$ unitary matrix will require a sequence of no fewer than $N^4/16$ one-qubit
and two-qubit gates. This observation, based on trivial dimensional considerations, suggests that for most classical
computer algorithms the growth of computational effort with number size will not be importantly diminished merely
by recasting the algorithm into a form usable in a quantum computer.

Finally we remark that factorization of a number N = pq via a quantum computer using Shor's algorithm actually
has been accomplished; although the number factored, namely 15, is the smallest possible product of odd primes,
the accomplishment assuredly is notable. It also is notable, however, that because $\phi(15) = 4 \times 2 = 8$ the only possible
orders r were r = 2 and r = 4, meaning that in this quantum computer factoring demonstration the value of r could
be inferred from Eq. (47) for any chosen n coprime to and <15, without the complications attendant upon the much
more usual circumstance that r has to be inferred from Eq. (48). Nor could this experiment test the feasibility of
determining r in the likely event that repetitions of steps 2 through 7 of the algorithm will be required, as discussed
under Subheading III.C.8.



IV. APPENDIX. PERTINENT NUMBER THEORETIC RESULTS. 



This Appendix presents the various number theoretic derivations and other results referred to in the preceding 
Sections of this paper. I take this opportunity to thank Paul Reilly for numerous enlightening discussions, especially 
on the RSA system. I am indebted to Joseph Burdis for carefully checking the manuscript, including its references. I 
also am indebted to Sam Scheinman for data on RSA enciphering and deciphering. 



A. Congruence Manipulations. Illustrative RSA Operations. Periodicity of Remainders $f_j$.



Comparison of Eqs. (1) and (5) illustrates the proposition that (subject to the important proviso that all the
congruences must have the same modulus m) congruences like Eqs. (1)-(5) have the useful property that in many
respects they can be manipulated as if they were equalities, i.e., as if the congruence symbol $\equiv$ were the equality
symbol =. For example Eq. (1) and

$$x \equiv y \pmod{m} \qquad (60)$$

imply both

$$ax \equiv by \pmod{m} \qquad (61)$$

and

$$bx \equiv ay \pmod{m}. \qquad (62)$$

Accordingly Eq. (1) implies

$$a^z \equiv b^z \pmod{m} \qquad (63)$$

for any positive integer z. Eqs. (61)-(63) can be trivially demonstrated remembering that Eq. (1) means a = b + wm,
w some positive or negative integer. There are a few permissible manipulations of equalities that have no congruence
counterparts, but any such manipulations are not relevant to this paper. Unless explanatory comments seem required,
therefore, the remainder of this Appendix will manipulate congruences as if they were equalities without further ado.

The use of congruence manipulations to conveniently compute Alice's S = 21, 14, 51, 1, 13, 27, 10, 1, 9, 8, 49,
51 (given at the end of Subsection II.B) from her illustrative C = 21, 9, 6, 1, 7, 3, 10, 1, 4, 2, 14, 6 (quoted in
Subsection II.A) will now be exemplified. Our illustrative RSA key number and encryption exponent, to be inserted
into Eq. (2) along with each c in C, are N = 55 and e = 23 respectively. Consider initially Alice's first c = 21. Instead
of determining the corresponding s by tediously computing $c^{23} = (21)^{23}$ and then dividing by 55, Alice proceeds as
follows:

$$(21)^2 = 441 \equiv 1 \pmod{55}, \qquad (64)$$

$$(21)^{22} \equiv (1)^{11} \equiv 1 \pmod{55}, \qquad (65)$$

$$(21)^{23} = (21)(21)^{22} \equiv 21(1) = 21 \pmod{55}. \qquad (66)$$

So the first s in S turns out to equal the first c = 21. The second s is obtained not quite as simply, but surely a lot
more easily than having to exactly compute $(9)^{23}$, namely:

$$(9)^2 = 81 \equiv 26 \pmod{55}, \qquad (67)$$

$$(9)^4 \equiv (26)^2 = 676 \equiv 16 \pmod{55}, \qquad (68)$$

$$(9)^8 \equiv (16)^2 = 256 \equiv 36 \equiv -19 \pmod{55}, \qquad (69)$$

$$(9)^{10} \equiv (-19)(26) = -494 \equiv -54 \equiv 1 \pmod{55}, \qquad (70)$$

$$(9)^{20} \equiv (1)^2 = 1 \pmod{55}, \qquad (71)$$

$$(9)^{23} = (9)^{20}(9)^2(9) \equiv (26)(9) = 234 \equiv 14 \pmod{55}. \qquad (72)$$

Thus the second s is 14. Similar congruence manipulations on C readily yield the above-quoted complete S, as readers
of this paper now readily can verify. Similarly it is readily verified that Bob's secret decryption exponent d = 7 really
does decipher this S into C, namely that in accordance with Eq. (3): $(21)^7 = (21)(21)^6 \equiv 21$ (mod 55), making use
of Eq. (64) above; $(14)^2 = 196 \equiv 31$, $(14)^4 \equiv 961 \equiv 26$, $(14)^7 = (14)(31)(26) \equiv (14)(36) \equiv 9$ (mod 55), etc.
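
Readers who prefer to let a classical computer do the verifying may find the following minimal Python sketch useful. It simply applies Eqs. (2) and (3) to the illustrative numbers above, using Python's built-in three-argument `pow` for the modular powers; the list names `C`, `S` and `U` mirror the text's notation and are otherwise arbitrary.

```python
N, e, d = 55, 23, 7     # illustrative key number, encryption and decryption exponents

C = [21, 9, 6, 1, 7, 3, 10, 1, 4, 2, 14, 6]   # Alice's illustrative C

S = [pow(c, e, N) for c in C]   # Eq. (2): s = c**e (mod N)
U = [pow(s, d, N) for s in S]   # Eq. (3): u = s**d (mod N)

print(S)
print(U == C)
```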

The permissibility of these congruence manipulations also immediately implies the periodicity of the remainders $f_j$
defined by Eq. (8). Using Eq. (10), we see that $f_{j+r} \equiv n^{j+r} = n^j n^r \equiv n^j \equiv f_j$ (mod pq), implying $f_{j+r} = f_j$, since
by definition all the $f_j$ are positive numbers <N; similarly $f_{j+2r} \equiv n^{j+r}n^r \equiv n^{j+r} \equiv f_j$ (mod pq), etc. It also is
readily seen that all the $f_j$, $1 \le j \le r$, are different. For suppose $f_a = f_b$, where each of a, b lies in the just specified
range of j. Suppose further a < b. Then $n^a \equiv n^b$ (mod pq), meaning $n^b - n^a = n^a(n^{b-a} - 1)$ is divisible by pq. This
means $n^{b-a} - 1$ must be divisible by pq, because n has been chosen to be coprime to pq. On the other hand it is not
possible to have $n^{b-a} \equiv 1$ (mod pq) because by definition r is the smallest value of j for which $n^j \equiv 1$ (mod pq).
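
Both properties just proved, the periodicity of the $f_j$ and their distinctness within one period, are easily checked numerically for the illustrative N = 55, n = 37. The Python sketch below is offered only as such a check; the function names `order` and `remainders` are hypothetical, and the order is found by brute force, which of course is feasible only for very small N.

```python
def order(n, N):
    """Smallest r >= 1 with n**r = 1 (mod N); n is assumed coprime to N."""
    r, f = 1, n % N
    while f != 1:
        f = (f * n) % N
        r += 1
    return r

def remainders(n, N, count):
    """The remainders f_j = n**j (mod N) of Eq. (8), for j = 1, ..., count."""
    return [pow(n, j, N) for j in range(1, count + 1)]

N, n = 55, 37
r = order(n, N)
f = remainders(n, N, 3 * r)      # three full periods

print(r)
print(f[:r])
print(f[:r] == f[r:2 * r] == f[2 * r:], len(set(f[:r])) == r)
```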


B. Euler's Totient Function. Proof That For N = pq the Order r < N/2.



For any positive integer m, Euler's totient function $\phi(m)$ is the number of positive integers less than m that are
coprime to m; by definition $\phi(m)$ always is <m. Euler proved that if a is coprime to m, then

$$a^{\phi(m)} \equiv 1 \pmod{m}. \qquad (73)$$

Let us calculate $\phi(N)$ for key numbers N = pq, where p and q are odd primes. The only numbers <N that are not
coprime to N are multiples of p and q. There are $q - 1$ integers $p, 2p, \ldots, (q-1)p$ less than N; similarly there are $p - 1$
multiples of q that are <N. Since none of these numbers can coincide and be less than N,

$$\phi(N) = N - 1 - [(p-1) + (q-1)] = pq - p - q + 1 = (p-1)(q-1). \qquad (74)$$

Evidently the RSA $\phi$ introduced in Subsection II.A, and then employed in Eqs. (4) and (9), is $\phi(pq)$; the explicit
dependence on N = pq was dropped in those equations because no possible confusion could result therefrom. Equally
evidently, Eq. (73) immediately implies Eq. (9). I note parenthetically that if n is not coprime to pq, i.e., if n and
pq have a common factor x > 1, then $f_j$ in Eq. (8) also must be divisible by x, as is immediately seen remembering
Eq. (8) means $n^j - f_j = ypq$, y some integer. Consequently Eq. (10) cannot hold for any integer r unless n actually
is coprime to pq.

If n is coprime to pq moreover, Eq. (9) is supplemented by

$$n^{(p-1)(q-1)/2} \equiv 1 \pmod{pq}, \qquad (75)$$

where, as in Eq. (9) and always herein, n is coprime to pq. To prove Eq. (75) it is convenient to start from the
form taken by Eq. (73) when m is an odd prime p. Evidently $\phi(p) = p - 1$, so that

$$a^{p-1} \equiv 1 \pmod{p}, \qquad (76)$$

where a is coprime to p, of course. Eq. (76) is known as Fermat's Little Theorem, stated by him in 1640. Because
q also is an odd prime, $(q-1)/2$ is an integer, so that Eq. (76) implies [recall Eq. (63)]

$$a^{(p-1)(q-1)/2} \equiv 1 \pmod{p}. \qquad (77)$$

But if a also is coprime to q, then it similarly is true that

$$a^{(q-1)(p-1)/2} \equiv 1 \pmod{q}. \qquad (78)$$

If a is coprime to both p and q, however, then a is coprime to pq, i.e., a is an n as defined at the outset of Subsection
III.A. For any positive integer z, furthermore, if $z - 1$ is separately divisible by a prime p and by another prime q,
then $z - 1$ is divisible by the product pq. Hence the pair of Eqs. (77) and (78) imply Eq. (75). Eq. (75) in turn
implies that the order r of any n modulo N = pq is $\le \phi(N)/2$, so that r indeed is <N/2, an inequality that is crucial
to the derivation of the important Eq. (57).
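
For the illustrative N = 55, Eq. (75) and the resulting bound on r can be confirmed by exhaustive search, as in the following Python sketch. The sketch is merely a numerical check on a toy key number, not part of the proof; the helper name `order` is hypothetical and the search is brute force.

```python
from math import gcd

def order(n, N):
    """Smallest r >= 1 with n**r = 1 (mod N); n is assumed coprime to N."""
    r, f = 1, n % N
    while f != 1:
        f = (f * n) % N
        r += 1
    return r

p, q = 5, 11
N = p * q
phi = (p - 1) * (q - 1)                                # Eq. (74)

units = [n for n in range(1, N) if gcd(n, N) == 1]     # every n coprime to N
orders = {n: order(n, N) for n in units}

eq75_holds = all(pow(n, phi // 2, N) == 1 for n in units)
print(len(units), eq75_holds, max(orders.values()))
```

The largest order found is 20, which equals $\phi(N)/2$ and is indeed smaller than N/2 = 27.5.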

The probability that a randomly selected positive integer <N will be coprime to N obviously is

$$\frac{\phi(N)}{N-1} = 1 - \frac{p-1+q-1}{N-1}, \qquad (79)$$

using Eq. (74). For actual RSA key numbers, e.g., RSA-309, the right side of Eq. (79) will be indistinguishable from
unity for all practical purposes. For instance if the smaller of p and q is not less than $N^{1/4}$, the larger of p and q
will be no greater than $N^{3/4}$, and the right side of Eq. (79) is approximately $1 - N^{-1/4}$ which, even for a key number
as small as RSA-155, differs from unity by approximately $(10)^{-39}$. Correspondingly for actual RSA key numbers the
magnitude of $\phi(N)$ can be taken equal to N for all practical purposes. It is additionally worth noting that because
$\phi(N)$ evidently also equals the number of integers i coprime to N in the ranges $N + 1 \le i \le 2N$, $2N + 1 \le i \le 3N$,
etc., the probability that a randomly selected integer <$N^2$ (or <$2^y$, with y defined as under Subheading III.C.1) will
be coprime to N also can be taken equal to unity for all practical purposes.


C. Proof the RSA System Correctly Deciphers.



Eqs. (2) and (3), together with the definition of N, imply

$$u \equiv c^{de} \equiv (c^e)^d \pmod{pq}. \qquad (80)$$

Because d and e are positive integers by definition, and recalling the definition of $\phi = \phi(N)$, Eq. (4) implies

$$de = 1 + z\phi = 1 + z(p-1)(q-1), \qquad (81)$$

where z is a positive integer. Because by definition u and c are positive integers <N = pq, knowing $u \equiv c$ (mod pq)
implies u = c. Thus to prove the RSA system enables Bob to correctly decipher Alice's message I merely need show
that

$$c^{1+z(p-1)(q-1)} \equiv c \pmod{pq}. \qquad (82)$$

If c is coprime to N then the now demonstrated Eq. (9) implies

$$c^{z(p-1)(q-1)} \equiv 1 \pmod{pq}, \qquad (83)$$

from which Eq. (82) immediately follows after multiplying both sides of Eq. (83) by c. If c is not coprime to pq,
c < N is divisible by one of p and q but not by both. Suppose for concreteness c is divisible by q, i.e., suppose c = bq,
b a positive integer <p. Then Eq. (76) holds for a = c and implies

$$c^{z(p-1)(q-1)} \equiv 1 \pmod{p}. \qquad (84)$$

But if $x - y$ is divisible by p, then $q(x - y)$ is divisible by qp, implying further that $bq(x - y)$ is divisible by qp. Hence
Eq. (84) implies

$$bq\,c^{z(q-1)(p-1)} \equiv bq \pmod{qp}. \qquad (85)$$

Eq. (85) is Eq. (82) in the circumstance that c = bq. Correspondingly Eq. (82) will hold if c is divisible by p. We
conclude that Eq. (82) holds for every c in Alice's cryptogram whether c is coprime to N or not. This completes the
demonstration that the RSA system does enable Bob to correctly decipher Alice's cryptogram.
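
The conclusion that Eq. (82) holds whether or not c is coprime to N can be spot-checked for the illustrative N = 55, e = 23, d = 7 by running every positive c < N through Eqs. (2) and (3). The Python sketch below does so; it is a check on one toy key only, and the variable names are chosen here purely for illustration.

```python
from math import gcd

N, e, d = 55, 23, 7    # illustrative key number and exponents; de = 161 = 1 + 4(40)

# Encipher and then decipher every positive c < N.
round_trip = {c: pow(pow(c, e, N), d, N) for c in range(1, N)}

# The c values sharing a factor with N, i.e., the case treated via Eqs. (84) and (85).
not_coprime = [c for c in range(1, N) if gcd(c, N) > 1]

print(all(round_trip[c] == c for c in range(1, N)))
print(not_coprime)
```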



D. Classical Computer Calculations Pertinent to Shor Algorithm Factorization. 



This Subsection discusses the various classical computer calculations mentioned in earlier Subsections of this paper. 
The results under Subheadings 1 through 3 below are the bases for the assertions made in Subsection III.D about the 
growth with N of the classical computer calculations required to factor N = pq using Shor's algorithm. How Bob 
can determine his decryption exponent d from $\phi = \phi(N)$ and his encryption exponent e (recall Subsection II.A) is
described under Subheading 4 below. 



1. Greatest Common Divisors. The Euclidean Algorithm. 

A convenient method for computing the gcd of two positive integers was first described by Euclid. My discussion
of the Euclidean algorithm closely follows Rosen. Suppose the integer $x \ge 1$ is the gcd of the two positive integers
$s_0$ and $s_1$, where $s_0 > s_1 > 1$. The Euclidean algorithm determines x as follows. Divide $s_0$ by $s_1$, thereby obtaining
the remainder $s_2 \ge 0$. By definition $s_2$ is <$s_1$ and satisfies

$$s_0 = z_0 s_1 + s_2, \qquad (86)$$

where $z_0$ is a non-negative integer and $0 \le s_2 < s_1$. Since x is a divisor of $s_0$ and $s_1$, Eq. (86) implies x is a divisor
of $s_2$. Proceeding in this fashion, always dividing the previous divisor $s_j$ by the previous remainder $s_{j+1}$, one obtains
a sequence of remainders $s_2, s_3, \ldots, s_{j+1}, s_{j+2}, \ldots$, each of which is a multiple of x. Moreover since each $s_j$ always is
>$s_{j+1}$, the sequence eventually must terminate with some $s_{k+2} = 0$, i.e., eventually there will be the simple equation

$$s_k = z_k s_{k+1}. \qquad (87)$$

It now can be seen that $s_{k+1}$ is not merely a multiple of x, the gcd of $s_0$ and $s_1$; rather $s_{k+1} = x$. Eq. (87) shows
$s_k$ is a multiple of $s_{k+1}$. The preceding equation in the series, namely

$$s_{k-1} = z_{k-1}s_k + s_{k+1}, \qquad (88)$$

then implies $s_{k-1}$ also is a multiple of $s_{k+1}$. Thus, proceeding in this fashion back through the series of equations
which led from Eq. (86) to Eq. (88), it can be concluded that both $s_0$ and $s_1$ are multiples of $s_{k+1}$. Hence $s_{k+1}$ must
be a divisor of x, the gcd of $s_0$ and $s_1$. But we already have shown that x is a divisor of $s_{k+1}$. Consequently x must
be identical with $s_{k+1}$, the last remainder before $s_{k+2} = 0$. If $s_{k+1} = 1$, then $s_1$ is coprime to $s_0$.

I will illustrate the use of the Euclidean algorithm to find the factor 5 of N = 55 when, as explained at the end of
Subsection III.A, for n = 12 it is deduced that $f_2 + 1 = 35$ must be divisible by one of the factors of 55. We have: 55
= 1×35 + 20; 35 = 1×20 + 15; 20 = 1×15 + 5; 15 = 3×5 + 0. Therefore 5 is the gcd of 55 and 35. Similarly, suppose we
had decided (unnecessarily for large N as is discussed under Subheading III.C.3) to verify that 12 actually is coprime
to 55. Now we have: 55 = 4×12 + 7; 12 = 1×7 + 5; 7 = 1×5 + 2; 5 = 2×2 + 1; 2 = 2×1 + 0. So 1 is the gcd of 12
and 55, i.e., 12 really is coprime to 55.
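
The two illustrative computations just given can be reproduced with the short Python sketch below, which records each division of the form of Eq. (86) as it goes. The function name `euclid_steps` and the tuple layout are hypothetical conveniences, not anything prescribed by the text.

```python
def euclid_steps(s0, s1):
    """Run the Euclidean algorithm on s0 > s1 > 0.

    Returns (steps, gcd), where each step is (dividend, quotient, divisor, remainder),
    i.e., one equation dividend = quotient*divisor + remainder as in Eq. (86).
    """
    steps = []
    while s1 != 0:
        z, remainder = divmod(s0, s1)
        steps.append((s0, z, s1, remainder))
        s0, s1 = s1, remainder      # previous divisor becomes the next dividend
    return steps, s0                # s0 now holds the last nonzero remainder

for pair in [(55, 35), (55, 12)]:
    steps, g = euclid_steps(*pair)
    for dividend, z, divisor, remainder in steps:
        print(f"{dividend} = {z}x{divisor} + {remainder}")
    print("gcd =", g)
```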

How many classical computer bit operations are required to obtain the gcd of two large numbers $N_1$ and $N_2$
via the Euclidean algorithm? Define $L_1 = \log_2 N_1$, $L_2 = \log_2 N_2$; as discussed immediately beneath Eq. (29), for
large $N_1, N_2$ the quantities $L_1, L_2$ differ negligibly from the number of digits in the binary expansions of $N_1, N_2$
respectively. Then according to a theorem by Lamé the number of divisions needed to find the gcd of $N_1$ and $N_2$
using the Euclidean algorithm is at most $O(L_1)$, where O is the Order of symbol used in Eq. (28). The number of bit
operations in any one of those divisions hardly can exceed the number of bit operations in the first of those divisions
(of $N_1$ by $N_2$), wherein the dividend $N_1$ and divisor $N_2$ are at their respective maximum values. Although at first
sight the number of bit operations required to divide $N_1$ by $N_2$ is $O(L_1L_2)$, in actuality there exist sophisticated
classical computer algorithms which for large $N_1, N_2$ reduce the number of bit operations required for this division
to $O[L_1(\log_2 L_1)(\log_2\log_2 L_1)]$.

Consequently the number of computer bit operations required to obtain the gcd of two large numbers $N_1$ and $N_2$
using the Euclidean algorithm surely is $O[L_1^2(\log_2 L_1)(\log_2\log_2 L_1)]$. Returning now to the discussion in Subsection
III.D of the classical computer calculations required for factoring using Shor's algorithm, the two numbers whose
gcd is required always will be no larger than N = pq and $1 + f_{r/2}$, where according to Eq. (8) every $f_j$ is <N
by definition. Once an r permitting factorization of N has been inferred [namely an r satisfying Eqs. (11), (12)
and (14)], only a single gcd computation will be needed in order to complete the factorization of N. No such gcd
calculation is needed until a so usable r has been inferred. It follows that the number of bit operations required for
the gcd calculations involved in factoring N = pq using Shor's algorithm will not grow faster with increasing N than
$O[L^2(\log_2 L)(\log_2\log_2 L)]$, the same growth rate with L as is given by Eq. (28).

2. Continued Fraction Expansions. 

Rosen explicitly demonstrates that the divisions performed in finding the gcd of the positive integers $s_0$ and $s_1$
via the Euclidean algorithm are the same as the divisions performed in constructing the continued fraction expansion
of the fraction $s_1/s_0$. By way of illustration, suppose we seek the gcd of the integers 2253 and 4096 whose ratio
was expanded in the continued fraction of Eq. (55). We have: 4096 = 1×2253 + 1843; 2253 = 1×1843 + 410; 1843
= 4×410 + 203; 410 = 2×203 + 4; and so on. Evidently the divisions performed to obtain these relations indeed
are identical with those performed in constructing the right side of Eq. (55). Thus to estimate the number of bit
operations required to compute the continued fraction convergents of any one $c/2^y$ measured as described under
Subheading III.C.6, the result obtained under the immediately preceding Subheading is immediately applicable. It is
necessary only to observe that for sufficiently large N the value of $y = \log_2 2^y$ differs negligibly from $2L = \log_2 N^2$.
Accordingly, the number of bit operations required to perform a typical continued fraction expansion of a measured
$c/2^y$ should be $O[(2L)^2(\log_2 2L)(\log_2\log_2 2L)] = O[L^2(\log_2 L)(\log_2\log_2 L)]$, precisely the same result as obtained under
the previous Subheading for the Shor algorithm gcd calculation.
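
The identity between the Euclidean divisions and the continued fraction construction is made explicit in the Python sketch below, which builds the convergents of a fraction from the successive quotients by the standard recurrence. The function name `convergents` is hypothetical; note also that because 2253 < 4096 the sketch performs one preliminary division with quotient 0 before reaching the divisions listed above.

```python
def convergents(num, den):
    """Continued fraction convergents (a, b) of num/den, each in lowest terms.

    The successive divmod calls are exactly the divisions of the Euclidean
    algorithm; each quotient z is the next partial quotient of the expansion.
    """
    a_prev, a = 0, 1      # numerator recurrence seeds
    b_prev, b = 1, 0      # denominator recurrence seeds
    result = []
    while den != 0:
        z, remainder = divmod(num, den)
        a_prev, a = a, z * a + a_prev
        b_prev, b = b, z * b + b_prev
        result.append((a, b))
        num, den = den, remainder
    return result

N = 55
conv = convergents(2253, 4096)
# Only denominators smaller than N/2 can be candidates for the order r.
small = [(a, b) for a, b in conv if 2 * b < N]
print(conv)
print(small)
```

For c = 2253 the convergent 11/20 has the largest denominator below 55/2, whereas for c = 2048 the expansion stops at 1/2, which is the situation discussed under Subheading III.C.8.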

Unlike the gcd case, however, a continued fraction expansion is required every time a $c/2^y$ is measured. The expected
number of repetitions of such measurements has been discussed under Subheading III.C.8 and in Subsection III.D.
Those discussions indicated that a probable overestimate of the required number of repetitions is $\log_2 L$, implying that
the overall number of bit operations required to perform the continued fraction expansions during factoring by Shor's
algorithm may grow with increasing N as fast as, but no faster than, the right side of Eq. (59).

3. Modular Exponentiation.



Verifying that $n^r \equiv 1$ (mod N), and then computing $n^{r/2} \equiv f_{r/2}$ (mod N), so as hopefully to factor N = pq via
Eq. (11), involves so-called modular exponentiation. Volovich sketches the proof that the number of bit operations
required to calculate $n^r$ (mod N) on a classical computer is $O[L^2(\log_2 L)(\log_2\log_2 L)]$, i.e., grows with N as does the
right side of Eq. (28). As discussed under Subheading III.C.8, a few repetitions of these exponentiations may be
necessary because the probability that a chosen n will yield an r permitting factorization of N via Eq. (11) is only
about 1/2. A few more exponentiations may be necessary to rule out, as possible values of r, the denominators b
(and small multiples thereof) of convergents a/b to measured $c/2^y$ when a/b = d/r but $b \ll r$, also as discussed
under Subheading III.C.8. It does not appear, however, that as many repeated exponentiations ever will be required
as the $O(\log_2 L)$ repetition factor inferred from Eq. (58). Consequently, just as under the immediately preceding
Subheading, the number of bit operations needed to perform the classical computer modular exponentiations that
arise during factorization via Shor's algorithm may grow with increasing N as fast as, but surely no faster than, the
right side of Eq. (59).
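
For readers unfamiliar with modular exponentiation, the Python sketch below shows one common way of doing it, namely binary (repeated squaring) exponentiation, which needs only about $\log_2$ of the exponent multiplications modulo N. This is an illustrative textbook technique and is not claimed to be the procedure analyzed by Volovich; the function name `modexp` is hypothetical, and Python's built-in `pow(base, exponent, modulus)` performs the same task.

```python
def modexp(base, exponent, modulus):
    """Compute base**exponent (mod modulus) by right-to-left repeated squaring."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                     # current binary digit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus       # square for the next binary digit
        exponent >>= 1
    return result

N, n, r = 55, 37, 20
print(modexp(n, r, N))        # verifies n**r = 1 (mod N)
print(modexp(n, r // 2, N))   # the remainder used in the gcd step
```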



4. Finding the Decryption Exponent. 

We need to solve Eq. (4) for d, knowing e and $\phi$. There is a known closed formula for $\phi(\phi)$, the totient function
of $\phi$ (recall Subsection IV.B), in terms of the prime factors of $\phi$. Thus if we could factor $\phi$ we immediately could find
d. Namely since e is coprime to $\phi$ (recall Subsection II.A), Eq. (73) implies

$$e^{\phi(\phi)} \equiv 1 \pmod{\phi}. \qquad (89)$$

Consequently Eq. (4) is solved by

$$d \equiv e^{\phi(\phi)-1} \pmod{\phi}. \qquad (90)$$

When N is of the magnitude of modern RSA key numbers, factoring a $\phi = (p-1)(q-1) \approx N$ (again recall
Subsection IV.B) can be difficult, though perhaps not as difficult as factoring N = pq itself. In practice, therefore, d
probably would be determined as follows. Eq. (4) means there is an integer k such that

$$ed = 1 + k\phi. \qquad (91)$$

Eq. (91) is a Diophantine equation in the unknowns k and d, whose solution can be found by working backwards
from the set of equations constituting the Euclidean algorithm for the gcd of e and $\phi$.

I will illustrate this just explained method of solving Eq. (4) in our oft-employed illustrative case N = 55. We have
$\phi$ = 40, and have chosen e = 23 (again recall Subsection II.A). The Euclidean algorithm equations for obtaining the
gcd of 40 and 23 are: 40 = 1×23 + 17; 23 = 1×17 + 6; 17 = 2×6 + 5; 6 = 1×5 + 1; 5 = 5×1 + 0, verifying that
our e is coprime to our $\phi$. Now, working backwards: 6 - 5 = 1; 5 = 17 - 2×6, so 6 - (17 - 2×6) = 3×6 - 17 = 1; 6 =
23 - 17, so 3×(23 - 17) - 17 = 3×23 - 4×17 = 1; 17 = 40 - 23, so 3×23 - 4×(40 - 23) = 7×23 - 4×40 = 1. This last
equation is of the form of Eq. (91), and implies 7×23 $\equiv$ 1 (mod 40). Therefore the desired d equals 7, as asserted at
the close of Subsection II.A.
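
The working-backwards procedure just illustrated is what usually is called the extended Euclidean algorithm, and can be sketched in Python as follows. The sketch carries the back-substitution coefficients forward through the divisions instead of literally working backwards, which is an equivalent bookkeeping choice made here for brevity; the function names `extended_gcd` and `decryption_exponent` are hypothetical.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        z, remainder = divmod(a, b)    # one Euclidean division
        a, b = b, remainder
        x0, x1 = x1, x0 - z * x1       # update coefficient of the original a
        y0, y1 = y1, y0 - z * y1       # update coefficient of the original b
    return a, x0, y0

def decryption_exponent(e, phi):
    """Solve e*d = 1 + k*phi, Eq. (91), for the positive d < phi."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e must be coprime to phi")
    return x % phi

e, phi = 23, 40
d = decryption_exponent(e, phi)
print(extended_gcd(e, phi), d)
```

For e = 23 and $\phi$ = 40 the coefficients returned are 7 and -4, reproducing the relation 7×23 - 4×40 = 1 obtained above.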

It is apparent that the computing effort required of Bob in determining his d via this just described procedure is 
utterly negligible compared to the computing effort he will endure in decrypting the many messages he expects to 
receive from Alice. 